Leetcode #839: Similar String Groups
In this guide, we solve Leetcode #839 Similar String Groups in Python and focus on the core idea that makes the solution efficient.
You will see the intuition, the step-by-step method, and a clean Python implementation you can use in interviews.

Problem Statement
Two strings, X and Y, are considered similar if either they are identical or we can make them equivalent by swapping at most two letters (in distinct positions) within the string X. For example, "tars" and "rats" are similar (swapping at positions 0 and 2), and "rats" and "arts" are similar, but "star" is not similar to "tars", "rats", or "arts".
Quick Facts
- Difficulty: Hard
- Premium: No
- Tags: Depth-First Search, Breadth-First Search, Union Find, Array, Hash Table, String
Intuition
Fast membership checks and value lookups are the heart of this problem, which makes a hash map the natural choice.
By storing what we have already seen (or counts/indexes), we can answer the question in one pass without backtracking.
Approach
Scan the input once, using the map to detect when the condition is satisfied and to update state as you go.
This keeps the solution linear while remaining easy to explain in an interview setting.
Steps:
- Initialize a hash map for seen items or counts.
- Iterate through the input, querying/updating the map.
- Return the first valid result or the final computed value.
Example
Input: strs = ["tars","rats","arts","star"]
Output: 2
Python Solution
class UnionFind:
def __init__(self, n):
self.p = list(range(n))
self.size = [1] * n
def find(self, x):
if self.p[x] != x:
self.p[x] = self.find(self.p[x])
return self.p[x]
def union(self, a, b):
pa, pb = self.find(a), self.find(b)
if pa == pb:
return False
if self.size[pa] > self.size[pb]:
self.p[pb] = pa
self.size[pa] += self.size[pb]
else:
self.p[pa] = pb
self.size[pb] += self.size[pa]
return True
class Solution:
def numSimilarGroups(self, strs: List[str]) -> int:
n, m = len(strs), len(strs[0])
uf = UnionFind(n)
for i, s in enumerate(strs):
for j, t in enumerate(strs[:i]):
if sum(s[k] != t[k] for k in range(m)) <= 2 and uf.union(i, j):
n -= 1
return n
Complexity
The time complexity is O(n). The space complexity is O(n).
Edge Cases and Pitfalls
Watch for boundary values, empty inputs, and duplicate values where applicable. If the problem involves ordering or constraints, confirm the invariant is preserved at every step.
Summary
This Python solution focuses on the essential structure of the problem and keeps the implementation interview-friendly while meeting the constraints.