Leetcode #1236: Web Crawler
In this guide, we solve Leetcode #1236 Web Crawler in Python and focus on the core idea that makes the solution efficient.
You will see the intuition, the step-by-step method, and a clean Python implementation you can use in interviews.

Problem Statement
Given a url startUrl and an interface HtmlParser, implement a web crawler to crawl all links that are under the same hostname as startUrl. Return all urls obtained by your web crawler in any order.
Quick Facts
- Difficulty: Medium
- Premium: Yes
- Tags: Depth-First Search, Breadth-First Search, String, Interactive
Intuition
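Treat each page as a node and each link returned by HtmlParser.getUrls as a directed edge, so the task becomes a graph traversal that starts at startUrl.
DFS fits well here: a visited set stops the crawl from looping on cyclic links, and a hostname check keeps it from following links to other sites.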
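The hostname check is the part that is easiest to get subtly wrong, so here is a minimal sketch of it in isolation. It uses the standard library's urllib.parse.urlsplit rather than string slicing; the helper names hostname and same_host are made up for this illustration and are not part of the LeetCode API.

```python
from urllib.parse import urlsplit


def hostname(url: str) -> str:
    """Return the hostname of an absolute URL, or "" if it has none."""
    return urlsplit(url).hostname or ""


def same_host(a: str, b: str) -> bool:
    """True only when both URLs name exactly the same host."""
    return hostname(a) == hostname(b)


# Same host, different paths.
print(same_host("http://news.yahoo.com/news", "http://news.yahoo.com/us"))
# A subdomain is a different host from its parent domain.
print(same_host("http://news.yahoo.com", "http://yahoo.com"))
```

Comparing whole hostnames, instead of testing whether one URL starts with another, avoids treating a host such as example.org.evil.test as part of example.org.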
Approach
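Run a recursive DFS from startUrl that records every visited URL in a shared set. Nothing needs to be combined as the recursion unwinds, because the set itself is the answer.
Steps:
- Return immediately if the current URL is already in the visited set; otherwise add it.
- Call htmlParser.getUrls on the current URL and recurse into each link that shares the hostname of startUrl.
- Convert the visited set to a list and return it once the recursion finishes.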
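The same steps also work with a queue instead of recursion. The sketch below is a BFS variant, not the recursive solution this guide presents, and FakeHtmlParser is a hypothetical dict-backed stand-in for the judge's HtmlParser so that the code can run locally. The sample URLs are illustrative.

```python
from collections import deque
from typing import Dict, List


class FakeHtmlParser:
    """Hypothetical stand-in for the judge's HtmlParser, backed by a dict."""

    def __init__(self, links: Dict[str, List[str]]) -> None:
        self.links = links

    def getUrls(self, url: str) -> List[str]:
        return self.links.get(url, [])


def host(url: str) -> str:
    # Assumes every URL starts with "http://" (7 characters).
    return url[7:].split("/")[0]


def crawl_bfs(start_url: str, html_parser: FakeHtmlParser) -> List[str]:
    start_host = host(start_url)
    seen = {start_url}
    queue = deque([start_url])
    while queue:
        url = queue.popleft()
        for nxt in html_parser.getUrls(url):
            # Enqueue each unseen link that stays on the start hostname.
            if nxt not in seen and host(nxt) == start_host:
                seen.add(nxt)
                queue.append(nxt)
    return list(seen)


parser = FakeHtmlParser({
    "http://news.yahoo.com": ["http://news.yahoo.com/news", "http://news.google.com"],
    "http://news.yahoo.com/news": ["http://news.yahoo.com", "http://news.yahoo.com/us"],
})
print(sorted(crawl_bfs("http://news.yahoo.com", parser)))
```

Marking a URL as seen when it is enqueued, not when it is dequeued, keeps the same page from entering the queue twice.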
Example
The problem supplies the parser through the following interface (shown in its Java form):
interface HtmlParser {
    // Return a list of all urls from a webpage of given url.
    public List<String> getUrls(String url);
}
Python Solution
from typing import List

# """
# This is HtmlParser's API interface.
# You should not implement it, or speculate about its implementation
# """
# class HtmlParser(object):
#     def getUrls(self, url):
#         """
#         :type url: str
#         :rtype List[str]
#         """
class Solution:
    def crawl(self, startUrl: str, htmlParser: 'HtmlParser') -> List[str]:
        def host(url):
            # Drop the leading "http://" (7 characters), then keep the part before the first "/".
            return url[7:].split('/')[0]

        def dfs(url):
            # Skip pages that have already been crawled.
            if url in ans:
                return
            ans.add(url)
            for nxt in htmlParser.getUrls(url):
                # Follow only links that stay on the same hostname.
                if host(url) == host(nxt):
                    dfs(nxt)

        ans = set()
        dfs(startUrl)
        return list(ans)
Complexity
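Let V be the number of pages reachable on the start hostname and E the number of links the parser returns from those pages. Each page is fetched once and each link is inspected once, so the time complexity is O(V+E). The visited set and the recursion stack each hold at most V URLs, so the space complexity is O(V).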
Edge Cases and Pitfalls
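Compare whole hostnames, not prefixes: http://news.yahoo.com and http://yahoo.com are different hosts. The url[7:] slice assumes every URL starts with http:// and carries no port, which the problem statement allows you to assume; anything else would need a real URL parser. URLs that differ only by a trailing slash count as different URLs, so do not normalize them. Links can form cycles, so check the visited set before calling the parser again. On a long chain of pages, the recursive DFS can reach Python's default recursion limit of 1000 frames.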
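If recursion depth is a concern, one option is to replace the call stack with an explicit list. The following is a sketch under assumptions, not the solution shown in this guide: crawl_iterative and chain_links are hypothetical names, the parser is passed as a plain callable to keep the example short, and the chain of pages is synthetic.

```python
import sys
from typing import Callable, List


def crawl_iterative(start_url: str, get_urls: Callable[[str], List[str]]) -> List[str]:
    def host(url: str) -> str:
        # Assumes every URL starts with "http://" (7 characters).
        return url[7:].split("/")[0]

    start_host = host(start_url)
    seen = {start_url}
    stack = [start_url]
    while stack:
        url = stack.pop()
        for nxt in get_urls(url):
            # Push each unseen link that stays on the start hostname.
            if nxt not in seen and host(nxt) == start_host:
                seen.add(nxt)
                stack.append(nxt)
    return list(seen)


# Synthetic chain, deeper than the interpreter's recursion limit:
# page i links only to page i + 1.
depth = sys.getrecursionlimit() * 5


def chain_links(url: str) -> List[str]:
    i = int(url.rsplit("/", 1)[1])
    return [f"http://example.org/{i + 1}"] if i + 1 < depth else []


pages = crawl_iterative("http://example.org/0", chain_links)
print(len(pages) == depth)
```

Because the pending URLs live in an ordinary list, the depth of the link chain is bounded by available memory and not by the interpreter's frame limit.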
Summary
The crawl is a graph traversal: a visited set prevents repeat visits, and a hostname check confines the DFS to pages under the same host as startUrl. The resulting Python solution is short enough to write from memory in an interview.