Find broken links using Selenium Webdriver

Author - Manpreet Singh
19.04.2019

A broken link, or dead link, points to a page that cannot be found on the server. When a user follows it, the server typically returns a standard “404 Not Found” error.

There are several common causes of broken links:
1. A web developer deletes a linked page, puts an incorrect URL in a text link, or changes a URL without updating the links that point to it.
2. An external website has gone offline.
3. The web server is down.
4. A firewall or geolocation restriction blocks outside access.

Why should you check broken links?
When a user clicks a broken link, an error message is displayed. Broken links therefore leave a bad impression and make the website look unprofessional:
1. It’s a source of frustration for the user.
2. It gives the impression that you don’t do regular housekeeping.
3. It can harm your credibility as an authority.
4. It could affect your website conversion if it’s on a sales page.
5. It can hurt your SEO and page rankings.

Steps to check for broken links on a website using Selenium WebDriver:
1. Go to the website URL that you want to check for broken links.
2. Get all the links on the web page by their <a> tag.
3. Send an HTTP request for each link and read the HTTP response code.
4. Decide whether the link is valid or broken based on the HTTP response code.
5. Repeat this for all the links captured.
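
Before any request is sent, the href values collected in step 2 usually need some filtering: anchors without an href, mailto: links, and links that appear several times on the page. The following is only a minimal sketch of that filtering, assuming a hypothetical helper class named LinkFilter that is not part of Selenium or of the program in this post. It takes plain strings, so it can be tried without a browser.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper for illustration; not part of Selenium.
final class LinkFilter {

    // Returns the distinct http:// and https:// URLs from the given href values,
    // in the order in which they first appear. Null, empty, and other schemes
    // (mailto:, tel:, javascript:) are dropped.
    static List<String> checkableLinks(List<String> hrefs) {
        Set<String> unique = new LinkedHashSet<>();
        for (String href : hrefs) {
            if (href == null) {
                continue; // anchor tag without an href attribute
            }
            String trimmed = href.trim();
            if (trimmed.startsWith("http://") || trimmed.startsWith("https://")) {
                unique.add(trimmed); // a Set silently ignores repeated URLs
            }
        }
        return new ArrayList<>(unique);
    }
}
```

With Selenium, the input list could be built by calling getAttribute("href") on every element returned by findElements(By.tagName("a")). Checking each distinct URL only once keeps the number of HTTP requests down on pages that repeat the same navigation links.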
Code:

package Testing;
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.MalformedURLException;
import java.net.URL;
import java.util.Iterator;
import java.util.List;
import org.openqa.selenium.By;
import org.openqa.selenium.WebDriver;
import org.openqa.selenium.WebElement;
import org.openqa.selenium.chrome.ChromeDriver;

public class Test {
	public WebDriver driver;
	int respCode = 200;

	public void browser() throws InterruptedException
	{	
		//launch the Google Chrome browser	
		System.setProperty("webdriver.chrome.driver", "C:/chromedriver.exe");
		driver = new ChromeDriver();	
		driver.manage().window().maximize();

		//Navigate to the website url
		driver.get( "https://webnersolutions.com/");
		Thread.sleep(3000);

		//Get the list of links on the website
		List<WebElement> links = driver.findElements(By.tagName("a"));
		Iterator<WebElement> allLinks = links.iterator();

		//Identifying and Validating URL
		while(allLinks.hasNext()){
			String url = allLinks.next().getAttribute("href");
			if(url == null || url.isEmpty()){
				System.out.println(url + " URL is either not configured for anchor tag or it is empty");
				continue;
			}

			//Skip links that are not http or https, such as mailto: or tel:
			if(!url.startsWith("http")){
				System.out.println(url + " is not an HTTP(S) URL, skipping it.");
				continue;
			}
			
			//Send http request
			try {
				HttpURLConnection huc = (HttpURLConnection)(new URL(url).openConnection());
				huc.setRequestMethod("HEAD");
				huc.connect();
				
				//Validating the Links on the website
				respCode = huc.getResponseCode();
				if(respCode >= 400){
					System.out.println(url + "  is a broken link");
				}
				else{
					System.out.println(url+ "   is a valid link");
				}
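			} catch (MalformedURLException e) {
				//The href could not be parsed as a URL
				System.out.println(url + " is not a well-formed URL: " + e.getMessage());
			} catch (IOException e) {
				//Network failure, for example the server is down or unreachable
				System.out.println(url + " could not be reached: " + e.getMessage());
			}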
		}
		
		//Closing the browser and ending the WebDriver session
		System.out.println("Closing the browser");
		driver.quit();
	}

	public static void main(String[] args) throws InterruptedException{
		Test call = new Test();
		call.browser();
	}
}
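
Steps 3 and 4 can also be pulled out into a small helper so the decision rule is easy to test on its own. This is a rough sketch under a few assumptions: the class name LinkStatus and its method names are hypothetical, the 5-second timeouts are an arbitrary choice, and the rule "400 or above means broken" simply mirrors the check in the Test class.

```java
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.URI;

// Hypothetical helper for illustration; not part of the original program.
final class LinkStatus {

    // Step 4: treat any 4xx or 5xx response code as a broken link.
    static boolean isBroken(int statusCode) {
        return statusCode >= 400;
    }

    // Step 3: send a HEAD request and return the HTTP response code.
    // Expects an absolute http or https URL.
    static int fetchStatus(String url) throws IOException {
        HttpURLConnection connection =
                (HttpURLConnection) URI.create(url).toURL().openConnection();
        try {
            connection.setRequestMethod("HEAD");
            connection.setConnectTimeout(5000); // assumed value, in milliseconds
            connection.setReadTimeout(5000);    // assumed value, in milliseconds
            return connection.getResponseCode();
        } finally {
            connection.disconnect();
        }
    }

    // Checks every URL passed on the command line.
    public static void main(String[] args) {
        for (String url : args) {
            try {
                int code = fetchStatus(url);
                String verdict = isBroken(code) ? " is a broken link" : " is a valid link";
                System.out.println(url + verdict + " (" + code + ")");
            } catch (IOException | IllegalArgumentException e) {
                System.out.println(url + " could not be checked: " + e.getMessage());
            }
        }
    }
}
```

The timeouts keep one unresponsive server from stalling the whole run. Some servers reject HEAD requests, so a link reported as broken may be worth rechecking with a GET request before it is treated as dead.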

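Result:
Running the Test class prints one line to the console for each link on the page, for example the URL followed by "is a valid link" or "is a broken link".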

