Saturday, April 29, 2023

Concept of Hashing & Collision Resolution Techniques

 

Concept of Hashing & Collision Resolution Techniques

1️⃣ What is Hashing?

🔹 Hashing is a technique used to store and retrieve data efficiently using a hash function.
🔹 Instead of searching through an array, hashing directly maps a key to an index in a hash table.
🔹 Time Complexity:

  • Best Case: O(1) (Direct access using hash function)
  • Worst Case: O(n) (If many collisions occur)

2️⃣ Hash Table & Hash Function

  • Hash Table: A data structure that stores key-value pairs.
  • Hash Function: A function that converts a key into an index in the hash table.

🔹 Properties of a Good Hash Function

Minimizes Collisions (assigns unique indices)
Evenly distributes data across the table
Efficient computation (Fast to calculate)

✅ Example Hash Function (Modulo Method)


index = key % table_size;

If table_size = 10 and key = 25, then:



index = 25 % 10 = 5

The key 25 is stored at index 5 in the hash table.


3️⃣ What is Collision in Hashing?

🔹 Collision occurs when two different keys get mapped to the same index.
🔹 Example: If both 25 and 35 are mapped to index 5, a collision happens.

How to Resolve Collisions?

🔹 There are multiple Collision Resolution Techniques, including:
1️⃣ Open Addressing (Linear Probing, Quadratic Probing, Double Hashing)
2️⃣ Chaining (Separate Chaining with Linked Lists)


4️⃣ Collision Resolution Techniques

1️⃣ Open Addressing (Storing Elements in the Hash Table Itself)

✔ If a collision occurs, find the next available slot within the table.

🔹 (A) Linear Probing

Finds the next available slot sequentially (next index).
Primary clustering (elements get grouped together).


index = (hash(key) + i) % table_size;

Example: If index = 5 is occupied, try 6, 7, 8, etc.

🔹 (B) Quadratic Probing

Checks in quadratic steps (1², 2², 3², ...).
Secondary clustering may still happen.


index = (hash(key) + i²) % table_size;

Example: If index = 5 is occupied, try 6, 9, 14, etc.

🔹 (C) Double Hashing

Uses a second hash function for resolving collisions.
Avoids clustering completely.


index = (hash1(key) + i * hash2(key)) % table_size;

Example:


hash1(key) = key % 10; hash2(key) = 7 - (key % 7); // Ensures different probing sequence

2️⃣ Chaining (Separate Chaining with Linked Lists)

Each index in the hash table contains a linked list to store multiple values.
Efficient if the number of collisions is high.
Requires extra memory for linked lists.

✅ C Program: Hashing with Chaining


#include <stdio.h> #include <stdlib.h> #define TABLE_SIZE 10 // Structure for linked list node struct Node { int key; struct Node* next; }; // Hash table array of linked lists struct Node* hashTable[TABLE_SIZE]; // Hash function int hashFunction(int key) { return key % TABLE_SIZE; } // Insert function void insert(int key) { int index = hashFunction(key); struct Node* newNode = (struct Node*)malloc(sizeof(struct Node)); newNode->key = key; newNode->next = hashTable[index]; // Insert at beginning hashTable[index] = newNode; } // Search function int search(int key) { int index = hashFunction(key); struct Node* temp = hashTable[index]; while (temp) { if (temp->key == key) return 1; // Key found temp = temp->next; } return 0; // Key not found } // Display function void display() { for (int i = 0; i < TABLE_SIZE; i++) { printf("Index %d: ", i); struct Node* temp = hashTable[i]; while (temp) { printf("%d -> ", temp->key); temp = temp->next; } printf("NULL\n"); } } // Main function int main() { insert(10); insert(20); insert(30); insert(25); insert(35); display(); if (search(25)) printf("Element found\n"); else printf("Element not found\n"); return 0; }

Handles collisions well
Efficient for large datasets
Extra memory required for linked lists


5️⃣ Comparison: Open Addressing vs. Chaining

FeatureOpen AddressingChaining (Linked List)
Memory UsageNo extra memory neededExtra memory for linked lists
SpeedSlower if the table is fullFaster in case of collisions
Load FactorPerformance degrades as load increasesHandles load well
ImplementationComplex but saves spaceSimple but uses more memory

No comments:

Post a Comment

Complete Binary Tree in Data Structures

  Complete Binary Tree in Data Structures 🌳 A Complete Binary Tree (CBT) is a type of Binary Tree where: ✔ All levels except possibly t...