0049. 字母异位词分组【中等】

1. 📝 题目描述
2. 🎯 s.1 - 排序 + 哈希表分组

1. 📝 题目描述

leetcode

给你一个字符串数组，请你将字母异位词组合在一起。可以按任意顺序返回结果列表。

示例 1：

txt

输入: strs = ["eat", "tea", "tan", "ate", "nat", "bat"]
输出: [["bat"], ["nat", "tan"], ["ate", "eat", "tea"]]

解释：

在 strs 中没有字符串可以通过重新排列来形成 "bat"。
字符串 "nat" 和 "tan" 是字母异位词，因为它们可以重新排列以形成彼此。
字符串 "ate"，"eat"和 "tea" 是字母异位词，因为它们可以重新排列以形成彼此。

示例 2：

txt

输入: strs = [""]
输出: [[""]]

示例 3：

txt

输入: strs = ["a"]
输出: [["a"]]

提示：

1 <= strs.length <= 10^4
0 <= strs[i].length <= 100
strs[i] 仅包含小写字母

2. 🎯 s.1 - 排序 + 哈希表分组

cjspy

// 用排序后的字符串作为 key，通过简单哈希表（开放寻址）实现分组
#define HASH_SIZE 20011

typedef struct Entry {
    char key[110]; // 排序后的 key
    int* indices;  // 原字符串下标列表
    int count;
    int cap;
} Entry;

static Entry table[HASH_SIZE];
static int used[HASH_SIZE];

int charCmp(const void* a, const void* b) {
    return *(char*)a - *(char*)b;
}

static unsigned int hashStr(const char* s) {
    unsigned int h = 5381;
    while (*s)
        h = h * 33 + (unsigned char)*s++;
    return h % HASH_SIZE;
}

/**
 * Return an array of arrays of size *returnSize.
 * The sizes of the arrays are returned as *returnColumnSizes array.
 * Note: Both returned array and *columnSizes array must be malloced, assume
 * caller calls free().
 */
char*** groupAnagrams(char** strs, int strsSize, int* returnSize,
                      int** returnColumnSizes) {
    memset(used, 0, sizeof(used));

    for (int i = 0; i < strsSize; i++) {
        char key[110];
        strcpy(key, strs[i]);
        int len = strlen(key);
        qsort(key, len, sizeof(char), charCmp);

        unsigned int h = hashStr(key);
        while (used[h] && strcmp(table[h].key, key) != 0)
            h = (h + 1) % HASH_SIZE;

        if (!used[h]) {
            used[h] = 1;
            strcpy(table[h].key, key);
            table[h].cap = 4;
            table[h].count = 0;
            table[h].indices = (int*)malloc(4 * sizeof(int));
        }
        if (table[h].count == table[h].cap) {
            table[h].cap *= 2;
            table[h].indices =
                (int*)realloc(table[h].indices, table[h].cap * sizeof(int));
        }
        table[h].indices[table[h].count++] = i;
    }

    // 统计组数
    int groups = 0;
    for (int i = 0; i < HASH_SIZE; i++)
        if (used[i])
            groups++;

    char*** ans = (char***)malloc(groups * sizeof(char**));
    *returnColumnSizes = (int*)malloc(groups * sizeof(int));
    int g = 0;
    for (int i = 0; i < HASH_SIZE; i++) {
        if (!used[i])
            continue;
        int cnt = table[i].count;
        ans[g] = (char**)malloc(cnt * sizeof(char*));
        (*returnColumnSizes)[g] = cnt;
        for (int j = 0; j < cnt; j++)
            ans[g][j] = strs[table[i].indices[j]];
        free(table[i].indices);
        g++;
    }
    *returnSize = groups;
    return ans;
}

/**
 * @param {string[]} strs
 * @return {string[][]}
 */
var groupAnagrams = function (strs) {
  const map = new Map()
  for (const s of strs) {
    const key = s.split('').sort().join('')
    if (!map.has(key)) map.set(key, [])
    map.get(key).push(s)
  }
  return [...map.values()]
}

class Solution:
    def groupAnagrams(self, strs: List[str]) -> List[List[str]]:
        groups: dict[str, List[str]] = defaultdict(list)
        for s in strs:
            key = "".join(sorted(s))
            groups[key].append(s)
        return list(groups.values())

时间复杂度： $O (n \cdot k \log k)$ ，其中 $n$ 是字符串数量， $k$ 是字符串的最大长度，每个字符串排序需 $O (k \log k)$
空间复杂度： $O (n \cdot k)$ ，哈希表存储所有字符串及其排序后的 key

算法思路：

互为字母异位词的字符串，排序后结果相同，因此用排序后的字符串作为哈希表的 key
遍历每个字符串，将其排序得到 key，追加到哈希表对应分组中
最终返回哈希表所有值（即各分组列表）

0049. 字母异位词分组【中等】

0049. 字母异位词分组【中等】 ​

1. 📝 题目描述 ​

2. 🎯 s.1 - 排序 + 哈希表分组 ​

Discussions

0049. 字母异位词分组【中等】

1. 📝 题目描述

2. 🎯 s.1 - 排序 + 哈希表分组