Articles by FavTutor
  • AI News
  • Data Structures
  • Web Developement
  • AI Code GeneratorNEW
  • Student Help
  • Main Website
No Result
View All Result
FavTutor
  • AI News
  • Data Structures
  • Web Developement
  • AI Code GeneratorNEW
  • Student Help
  • Main Website
No Result
View All Result
Articles by FavTutor
No Result
View All Result
Home Data Structures and Algorithms

Find Median from Data Stream (C++, Java, Python)

Jatin Chanana by Jatin Chanana
March 1, 2024
Reading Time: 7 mins read
Find Median from Data Stream
Follow us on Google News   Subscribe to our newsletter

The median is used to describe the central tendency of a dataset. In finding the median from the data stream problem we have to implement a class, MedianFinder, to dynamically compute the median of a data stream, handling even and odd list sizes accurately. In this article, we will discuss an approach to solve this problem efficiently to find median and also discuss the time and space complexity of this problem.

Find Median from the Data Stream Problem

In this problem to find median from the data stream, we have to write a Median finder class that manages an ordered integer list, The median, the middle value of the list, is calculated differently for lists with even or odd sizes

In this problem, integers are added dynamically to a list of integers, and different functions are used to insert numbers into the list a function to find the median from the list of the current list of integers.

There are two cases:

Even: if the size of the list is even then there is no middle value, and the median is the mean of the two middle classes.

For even size

Odd: if the size of the list is odd then the middle value is the median for the current list of integers.

For odd size

We have to implement the following things to solve this problem:

  • MedianFinder():  initializes the MedianFinder object.
  • addNum(int num): adds the integer num from the data stream to the data structure.
  • double findMedian(): returns the median of all elements so far. Answers within 10-5 of the actual answer will be accepted.

Find Median using Heap

In this method, we divide our numbers into two groups: low numbers and high numbers. We want to quickly find the middle value (the median) without having to sort all the numbers.

  1. Firstly, we divide the numbers into two halves: one holds low numbers, and the other holds high numbers. To quickly access the median in constant time, we use maxHeap and minHeap to get the maximum of the low half and the minimum of the high half, respectively.
  2. We will use maxHeap to store half of the low numbers, the top of the maxHeap is the highest number among low numbers, minHeap to store half of the high numbers, top of the minHeap is the lowest number among high numbers.
  3. We have to balance the size between maxHeap and minHeap while processing. Hence after adding k elements,
    • If k = 2 * i then maxHeap.size = minHeap.size = i
    • If k = 2 * i + 1, let maxHeap store 1 element more than minHeap, then maxHeap.size = minHeap.size + 1.
  4. When we add a new number to our MedianFinder:
    • First, we add the number to the maxHeap. This might result in the maxHeap containing a larger number than should be in the minHeap. To correct this, we remove the highest element from the maxHeap and add it to the minHeap.
    • Next, if the minHeap has more elements than the maxHeap, we balance the sizes by removing the lowest element from the minHeap and adding it back to the maxHeap.
  5. When doing findMedian():
    • If maxHeap.size > minHeap.size returns to the top of the maxHeap, which is the highest number among low numbers.
    • Else if maxHeap.size == minHeap return the (maxHeap.top() + minHeap.top()) / 2.

C++ Code

Here is the C++ implementation of this solution:

#include <iostream>
#include <queue>
#include <vector>
using namespace std;

class MedianFinder {
public:
    priority_queue<int> maxHeap; // Max heap to store smaller half of numbers
    priority_queue<int, vector<int>, greater<int>> minHeap; // Min heap to store larger half of numbers
    
    MedianFinder() {
    }
    
    void addNum(int num) {
        maxHeap.push(num); // Add the number to maxHeap
        
        // Move the top element of maxHeap to minHeap
        minHeap.push(maxHeap.top());
        maxHeap.pop();
        
        // If the minHeap has more elements, balance the heaps
        if (minHeap.size() > maxHeap.size()) {
            maxHeap.push(minHeap.top());
            minHeap.pop();
        }
    }
    
    double findMedian() {
        if (maxHeap.size() > minHeap.size()) 
            return maxHeap.top(); // If maxHeap has more elements, median is at the top of maxHeap
        return (maxHeap.top() + minHeap.top()) / 2.0; // Otherwise, average the tops of both heaps for the median
    }
};

int main() {
    MedianFinder obj;
    obj.addNum(1);
    obj.addNum(2);
    cout << obj.findMedian() << endl; // Output: 1.5
    obj.addNum(3);
    cout << obj.findMedian() << endl; // Output: 2.0
    return 0;
}

Python Code

We can do in Python programming also to find median:

import heapq

class MedianFinder:
    def __init__(self):
        self.maxHeap = [] # Max heap to store smaller half of numbers
        self.minHeap = [] # Min heap to store larger half of numbers
    
    def addNum(self, num):
        heapq.heappush(self.maxHeap, -num) # Add the number to maxHeap
        
        # Move the top element of maxHeap to minHeap
        heapq.heappush(self.minHeap, -heapq.heappop(self.maxHeap))
        
        # If the minHeap has more elements, balance the heaps
        if len(self.minHeap) > len(self.maxHeap):
            heapq.heappush(self.maxHeap, -heapq.heappop(self.minHeap))
    
    def findMedian(self):
        if len(self.maxHeap) > len(self.minHeap): 
            return -self.maxHeap[0] # If maxHeap has more elements, median is at the top of maxHeap
        return (-self.maxHeap[0] + self.minHeap[0]) / 2.0 # Otherwise, average the tops of both heaps for the median

# Example Usage
obj = MedianFinder()
obj.addNum(1)
obj.addNum(2)
print(obj.findMedian()) # Output: 1.5
obj.addNum(3)
print(obj.findMedian()) # Output: 2.0

Java Code

Well, let’s also solve it with Java:

import java.util.PriorityQueue;

class MedianFinder {
    PriorityQueue<Integer> maxHeap; // Max heap to store smaller half of numbers
    PriorityQueue<Integer> minHeap; // Min heap to store larger half of numbers
    
    MedianFinder() {
        maxHeap = new PriorityQueue<>((a, b) -> b - a);
        minHeap = new PriorityQueue<>();
    }
    
    public void addNum(int num) {
        maxHeap.offer(num); // Add the number to maxHeap
        
        // Move the top element of maxHeap to minHeap
        minHeap.offer(maxHeap.poll());
        
        // If the minHeap has more elements, balance the heaps
        if (minHeap.size() > maxHeap.size()) {
            maxHeap.offer(minHeap.poll());
        }
    }
    
    public double findMedian() {
        if (maxHeap.size() > minHeap.size()) 
            return maxHeap.peek(); // If maxHeap has more elements, median is at the top of maxHeap
        return (maxHeap.peek() + minHeap.peek()) / 2.0; // Otherwise, average the tops of both heaps for the median
    }

    public static void main(String[] args) {
        MedianFinder obj = new MedianFinder();
        obj.addNum(1);
        obj.addNum(2);
        System.out.println(obj.findMedian()); // Output: 1.5
        obj.addNum(3);
        System.out.println(obj.findMedian()); // Output: 2.0
    }
}

Output:

1.5
2

Complexity Analysis

The Time complexity is O(log n) for adding a number and O(1) time complexity for finding the median in a data stream, where n is the number of elements added. Overall time complexity is dominated by adding numbers. The space complexity of maintaining these priority queues is O(n), where n is the number of elements added to the MedianFinder.

Conclusion

In conclusion, we have discussed the find median in the data stream problem from leetcode and we solved it using heaps and discussed how functions can be implemented using heaps and also discussed implementation in C++, Java, and Python. With time and space complexity

ShareTweetShareSendSend
Jatin Chanana

Jatin Chanana

I am Jatin, a final-year undergraduate student with a keen interest in the field of software development. I possess a strong academic background and a deep passion for Data Structures, Algorithms, and Web Development.

RelatedPosts

validate ip address

Validate IP Address Problem ( C++, JAVA, Python)

April 3, 2025
Shortest Palindrome Problem

Shortest Palindrome Problem ( C++, JAVA, Python)

April 4, 2025
Zigzag Conversion Problem

Zigzag Conversion Problem (C++, Java, Python)

April 4, 2025
Next Right Pointers In Each Node

Populating Next Right Pointers In Each Node (with code)

March 27, 2024
Coin Change II

Coin Change II Problem (C++, Java, Python)

April 7, 2025

About FavTutor

FavTutor is a trusted online tutoring service to connects students with expert tutors to provide guidance on Computer Science subjects like Java, Python, C, C++, SQL, Data Science, Statistics, etc.

Categories

  • AI News, Research & Latest Updates
  • Trending
  • Data Structures
  • Web Developement
  • Data Science

Important Subjects

  • Python Assignment Help
  • C++ Help
  • R Programming Help
  • Java Homework Help
  • Programming Help

Resources

  • About Us
  • Contact Us
  • Editorial Policy
  • Privacy Policy
  • Terms and Conditions

Website listed on Ecomswap. © Copyright 2025 All Rights Reserved.

No Result
View All Result
  • AI News
  • Data Structures
  • Web Developement
  • AI Code Generator
  • Student Help
  • Main Website

Website listed on Ecomswap. © Copyright 2025 All Rights Reserved.