DISTRIBUTED SYSTEMS (CSE-3261)
MINI PROJECT REPORT ON
Publisher Subscriber Architecture using ZeroMQ & Redis

SUBMITTED TO
Department of Computer Science & Engineering

by
Jainam Gandhi 200905253 C-49
Nish Patel 200905245 C-47
Parth Khurana 200905236 C-43
Yash Dhaga 200905274 C-51

Name & Signature of Evaluator 1
Name & Signature of Evaluator 2

(Jan 2023 – May 2023)

Table of Contents
Chapter 1 INTRODUCTION
    1.1 Introduction to Publisher Subscriber Architecture
    1.2 Introduction to ZeroMQ
    1.3 What is Redis?
Chapter 2 BACKGROUND THEORY and/or LITERATURE REVIEW
    2.1 General Mechanism of Pub-Sub Model
    2.2 Integration of ZeroMQ and Redis
Chapter 3 METHODOLOGY
Chapter 4 RESULTS AND DISCUSSION
Chapter 5 CONCLUSIONS AND FUTURE ENHANCEMENTS
REFERENCES

1. INTRODUCTION
This mini project aims to implement the Publisher-Subscriber Architecture using Python, ZeroMQ and Redis.

1.1 Introduction to Publisher Subscriber Architecture
The publisher-subscriber architecture, also known as the pub/sub architecture, is a messaging pattern used in software design to enable communication between different components of a system. It is a type of event-driven architecture that decouples components by removing the direct dependency between the sender and the receiver of messages.

In the publisher-subscriber architecture, there are two main actors: the publisher and the subscriber. The publisher is responsible for generating and broadcasting messages, while the subscriber receives and processes those messages. This enables asynchronous communication between components and allows for scalability, flexibility, and modularity in the system.

1.2 Introduction to ZeroMQ
ZeroMQ (also known as ØMQ) is a messaging library that provides lightweight, low-level messaging for distributed and concurrent applications.
ZeroMQ is designed to be fast, scalable, and reliable, and it supports a variety of messaging patterns, including the publish-subscribe pattern. ZeroMQ is often used in high-performance computing, financial trading systems, and other applications that require low-latency, high-throughput messaging.

One of the key benefits of ZeroMQ is its ability to handle large volumes of data with low overhead, making it an efficient choice for applications that require high-performance messaging. ZeroMQ supports a variety of programming languages, including C, C++, Python, Java, and many others, and it runs on a wide range of operating systems. ZeroMQ is also designed to be highly customizable, with a modular architecture that allows developers to build custom messaging patterns and protocols.

1.3 What is Redis?
Redis (Remote Dictionary Server) is an open-source, in-memory data structure store that can be used as a database, cache, and message broker. Redis supports a variety of data structures, including strings, hashes, lists, sets, and sorted sets, and provides a number of advanced features such as transactions, pub/sub messaging, and Lua scripting.

Redis is designed to be fast, scalable, and highly available, making it a good choice for applications that require high-performance data storage and retrieval. Because Redis stores data in memory, it delivers very fast read and write performance, especially for frequently accessed data.

One of the key features of Redis is its support for pub/sub messaging, which allows applications to send messages to multiple subscribers in a scalable and efficient way. Redis pub/sub messaging can be used to build real-time applications such as chat rooms, real-time analytics, and data streams.

Redis is commonly used in a variety of applications, including e-commerce, social networking, and real-time analytics.
It is available as a self-hosted solution or as a managed service in the cloud, and it supports a variety of programming languages, including Python, Java, and C#.

2. BACKGROUND THEORY and/or LITERATURE REVIEW

2.1 General Mechanism of Pub-Sub Model
In this model, there are two main actors: the publisher and the subscriber. The publisher is responsible for generating and broadcasting messages, while the subscriber receives and processes those messages.

The general mechanism of the publisher-subscriber model involves the following steps:
1. The publisher generates a message and sends it to a message broker or middleware, which serves as an intermediary that receives and routes messages between the publisher and the subscribers.
2. The message broker then delivers the message to all interested subscribers that have registered to receive messages of a certain type or topic.
3. The subscribers receive the message and process it according to their specific requirements.
4. If necessary, the subscribers can send a response back to the publisher or another component of the system.

One of the key advantages of the publisher-subscriber model is that it allows for asynchronous communication between components. Publishers can broadcast messages to multiple subscribers without having to wait for a response, enabling parallel processing and improved performance.

2.2 Integration of ZeroMQ and Redis
ZeroMQ and Redis can be integrated to create a powerful messaging system that combines the high performance and low latency of ZeroMQ with the persistence and durability of Redis.

In this project, the integration of ZeroMQ and Redis for the publisher-subscriber model is achieved in the following way:
1. The publisher generates a message and assigns it a unique message ID.
2. The publisher stores the message body in Redis under a key built from the topic and the message ID, with an expiry time, and then sends the topic and message ID through a ZeroMQ PUB socket.
3. Interested subscribers subscribe to the topic on a ZeroMQ SUB socket and receive a notification for each new message published to it.
4. When a subscriber receives a notification, it fetches the message body from Redis, processes it as required, and marks it as read.
5. If necessary, the subscriber can send a response back to the publisher or another component of the system through a ZeroMQ socket.

This integration provides several advantages over using either ZeroMQ or Redis alone. Because messages are kept in Redis until they expire (180 seconds by default in this implementation), subscribers can retrieve messages they missed or recover from system failures. At the same time, ZeroMQ provides fast, low-latency message delivery, enabling high-performance communication between components. Additionally, Redis can be used as a message broker for multiple ZeroMQ publishers and subscribers, allowing for a scalable and distributed messaging system.

Overall, the integration of ZeroMQ and Redis provides a robust, high-performance messaging system that is well-suited to a variety of use cases.

3. METHODOLOGY
The implementation of a publisher-subscriber model depends on the specific messaging system and tools being used. However, the general steps involved in implementing a publisher-subscriber model are as follows:
1. Design the message format: Decide on the format of the messages that will be sent between the publishers and subscribers. This may include the data to be sent, the message structure, and any metadata that needs to be included.
2. Choose a messaging system: Select a messaging system that supports the publisher-subscriber model, such as ZeroMQ, Redis, or MQTT. The choice of messaging system will depend on factors such as performance, scalability, reliability, and ease of use.
3. Set up publishers and subscribers: Create instances of the publisher and subscriber components and configure them to use the chosen messaging system. The publisher should be configured to generate messages and publish them to the messaging system, while the subscriber should be configured to receive messages from the messaging system.
4. Define message topics: Define the topics or channels that messages will be published to and subscribed from. Topics can be used to categorize messages and enable subscribers to selectively receive messages of interest.
5. Implement message processing: Define the actions that will be taken by the subscribers when they receive messages. This may include processing the message data, updating application state, triggering other actions, or sending responses back to the publishers.
6. Test and refine: Test the implementation and refine as needed to ensure that messages are being delivered correctly and that the system is performing as expected.

Some tools and frameworks provide built-in support for the publisher-subscriber model, making it easier to implement. For example, in the case of ZeroMQ, the zmq library provides a set of high-level abstractions that simplify the creation of publishers and subscribers, as well as the configuration of message topics and message processing.
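Steps 4 and 5 above can be illustrated without any messaging library. The sketch below is a toy in-memory broker and not the project's code: `InMemoryBroker`, `subscribe`, and `publish` are hypothetical names introduced here for illustration. It assumes prefix-based topic matching, which is how ZeroMQ SUB sockets filter incoming messages, so a subscription to `GEN` also receives messages published on `GENERAL`.

```python
from collections import defaultdict
from typing import Callable, DefaultDict, List

Handler = Callable[[str, str], None]


class InMemoryBroker:
    """Toy stand-in for a pub/sub transport (hypothetical, for illustration only)."""

    def __init__(self) -> None:
        # Maps a subscription prefix to the handlers registered for it.
        self._handlers: DefaultDict[str, List[Handler]] = defaultdict(list)

    def subscribe(self, topic_prefix: str, handler: Handler) -> None:
        """Register a handler for every topic that starts with topic_prefix."""
        self._handlers[topic_prefix].append(handler)

    def publish(self, topic: str, payload: str) -> int:
        """Deliver payload to all matching handlers; return the delivery count."""
        delivered = 0
        for prefix, handlers in self._handlers.items():
            if topic.startswith(prefix):
                for handler in handlers:
                    handler(topic, payload)
                    delivered += 1
        return delivered


received = []
broker = InMemoryBroker()
broker.subscribe("NEWS", lambda topic, payload: received.append((topic, payload)))
broker.subscribe("GEN", lambda topic, payload: received.append((topic, payload)))

broker.publish("NEWS", "hello")      # matches the "NEWS" subscription
broker.publish("GENERAL", "notice")  # matches "GEN" because matching is by prefix
broker.publish("SPORTS", "score")    # no subscription matches, so it is dropped
```

Dropping the unmatched message mirrors PUB/SUB semantics: the publisher does not know or care whether anyone is listening. Each handler here stands for an independent subscriber.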
message.py:

from redis_connector import redis_connection


def get_topic_msg_key(topic, msg_id, delimiter=":"):
    # Key under which a message body is stored: TOPIC:<topic>:MESSAGE:<msg_id>
    key = 'TOPIC{delimiter}{topic}{delimiter}MESSAGE{delimiter}{msg_id}'.format(
        topic=topic, msg_id=msg_id, delimiter=delimiter
    )
    return key


def get_topic_msg_read_key(topic, msg_id, sub_id, delimiter=":"):
    # Per-subscriber marker recording that sub_id has read the message.
    key = ('SUBSCRIBER{delimiter}{sub_id}{delimiter}READ{delimiter}'
           'TOPIC{delimiter}{topic}{delimiter}MESSAGE{delimiter}{msg_id}').format(
        topic=topic, msg_id=msg_id, sub_id=sub_id, delimiter=delimiter
    )
    return key


def get_topic_key_pattern(topic, delimiter=':'):
    # Glob pattern matching every stored message of a topic.
    key = 'TOPIC{delimiter}{topic}{delimiter}MESSAGE{delimiter}*'.format(
        topic=topic, delimiter=delimiter
    )
    return key


def extract_msg_id_from_topic_msg_key(topic_msg_key):
    # The message ID is everything after the last delimiter.
    return topic_msg_key[topic_msg_key.rfind(':') + 1:]


def write_msg(topic, msg_id, msg, ttl=180):
    msg_key = get_topic_msg_key(topic, msg_id)
    # Store the body and its expiry time (in seconds) in a single command.
    redis_connection.set(msg_key, msg, ex=ttl)


def get_msg(topic, msg_id):
    msg_key = get_topic_msg_key(topic, msg_id)
    msg = redis_connection.get(msg_key)
    return msg


def get_msg_by_key(topic_msg_key):
    return redis_connection.get(topic_msg_key)


def pop_msg(topic, msg_id):
    msg_key = get_topic_msg_key(topic, msg_id)
    msg = redis_connection.get(msg_key)
    redis_connection.delete(msg_key)
    return msg


def mark_msg_as_read(topic, msg_id, sub_id):
    topic_msg_read_key = get_topic_msg_read_key(topic, msg_id, sub_id)
    msg_key = get_topic_msg_key(topic, msg_id)
    # Let the read marker expire together with the message it refers to.
    # A non-positive TTL means the message is gone, so no marker is needed.
    remaining_ttl = redis_connection.ttl(msg_key)
    if remaining_ttl > 0:
        redis_connection.set(topic_msg_read_key, 1, ex=remaining_ttl)


def is_msg_read(topic, msg_id, sub_id):
    topic_msg_read_key = get_topic_msg_read_key(topic, msg_id, sub_id)
    return redis_connection.get(topic_msg_read_key) is not None


def get_msgs_for_topic(topic):
    # Iterate lazily over the keys of all messages stored for the topic.
    topic_key_pattern = get_topic_key_pattern(topic)
    yield from redis_connection.scan_iter(match=topic_key_pattern, count=100)

publisher.py:

import uuid

import zmq

import message as MSG

pub = None


def connect(host="*", port=5555):
    # Create and bind the PUB socket once; later calls reuse it.
    global pub
    if pub is None:
        context = zmq.Context()
        pub = context.socket(zmq.PUB)
        pub.bind("tcp://{}:{}".format(host, port))
    return pub


def publish(topic, msg):
    # Store the body in Redis first, then notify subscribers with "<topic> <msg_id>".
    msg_id = str(uuid.uuid4())
    MSG.write_msg(topic, msg_id, msg)
    pub.send_string("{} {}".format(topic, msg_id))


if __name__ == "__main__":
    port = int(input("Enter port: "))
    connect(port=port)
    while True:
        topic = input("Enter topic name: ")
        msg = input("Enter message: ")
        if msg in ('stop', 'exit'):
            break
        publish(topic, msg)

subscriber.py:

import zmq

import message as MSG
import settings

sub_id = None


def process_raw_msg(topic, msg):
    # Redis returns bytes; decode them for display.
    if isinstance(msg, bytes):
        msg = msg.decode()
    print("Received on topic {}: {}".format(topic, msg))


def fetch_and_process_msg(topic, msg_id):
    global sub_id
    if not MSG.is_msg_read(topic, msg_id, sub_id):
        msg = MSG.get_msg(topic, msg_id)
        if msg is not None:
            process_raw_msg(topic, msg)
            MSG.mark_msg_as_read(topic, msg_id, sub_id)
    else:
        print("Message has already been read!")


def process_pending_msgs(*topics):
    # Replay messages that were stored in Redis before this subscriber connected.
    for topic in topics:
        for topic_msg_key in MSG.get_msgs_for_topic(topic):
            topic_msg_key = topic_msg_key.decode()
            msg_id = MSG.extract_msg_id_from_topic_msg_key(topic_msg_key)
            print("Pending message found on topic {}".format(topic))
            fetch_and_process_msg(topic, msg_id)


def subscribe(port, topics=None):
    context = zmq.Context()
    sub = context.socket(zmq.SUB)
    print("Waiting for messages...")
    sub.connect("tcp://localhost:{}".format(port))
    # Copy the caller's list (avoids a shared mutable default) and add the global topics.
    topics = list(topics or []) + settings.GLOBAL_TOPICS
    # Drop blank names: an empty subscription prefix would match every topic.
    topics = [t.strip() for t in topics if t.strip()]
    for topic in topics:
        sub.setsockopt(zmq.SUBSCRIBE, topic.encode())
    process_pending_msgs(*topics)
    while True:
        topic_msg = sub.recv().decode()
        print("New message: ", topic_msg)
        # Notifications have the form "<topic> <msg_id>".
        for_topic, _, msg_id = topic_msg.partition(' ')
        fetch_and_process_msg(for_topic, msg_id)


if __name__ == "__main__":
    sub_id = input("Enter subscriber id(alphanumeric): ")
    port = int(input("Enter port: "))
    topics = input("Enter topic names(comma-separated) to subscribe to: ")
    subscribe(port, topics.split(","))

redis_connector.py:

import redis

import settings

print("Initialization - Redis connection pool")
redis_conn_pool = redis.ConnectionPool(
    host=settings.REDIS_CONF.get('REDIS_HOST', 'localhost'),
    port=settings.REDIS_CONF.get('REDIS_PORT', 6379),
    db=settings.REDIS_CONF.get('REDIS_DB', 0),
    password=settings.REDIS_CONF.get('REDIS_PASSWORD', None),
)
redis_connection = redis.Redis(connection_pool=redis_conn_pool)

settings.py:

GLOBAL_TOPICS = ["GLOBAL", "GENERAL"]

REDIS_CONF = {
    'REDIS_HOST': '127.0.0.1',
    'REDIS_PORT': 6379,
    'REDIS_DB': 0
}

4. RESULTS AND DISCUSSION

5. CONCLUSIONS AND FUTURE ENHANCEMENTS
In conclusion, the publisher-subscriber model is a messaging pattern used in software architecture to enable communication between different components of a system. It involves two main actors, the publisher and the subscriber, where the publisher generates and broadcasts messages, and the subscriber receives and processes those messages.

The publisher-subscriber model enables asynchronous communication between components, allowing parallel processing and improved performance. It also promotes the separation of concerns between components, leading to greater modularity and ease of maintenance. Additionally, the model decouples components, making the system more flexible and adaptable to changes.

The publisher-subscriber model can be implemented using various messaging systems, such as ZeroMQ and Redis, which can be integrated to create a powerful messaging system that combines high performance, low latency, persistence, and durability. Overall, the publisher-subscriber model is a useful messaging pattern that can be applied to various use cases, providing efficient and flexible communication between components of a system.
There are several advancements that can be made to the publisher-subscriber model to further improve its performance, flexibility, and usability. Some of these advancements include:
1. Scalability: As systems grow larger and more complex, it becomes more challenging to manage the large volume of messages being exchanged. Techniques such as load balancing and sharding can be used to improve scalability by distributing the load across multiple servers or clusters.
2. Security: The publisher-subscriber model can be vulnerable to attacks such as man-in-the-middle attacks or unauthorized access to message content. Implementing secure communication protocols such as SSL/TLS can help to address these security concerns.
3. Interoperability: The publisher-subscriber model can benefit from improved interoperability with other messaging systems and protocols. For example, integrating with other messaging systems such as RabbitMQ or Apache Kafka can enable greater flexibility and expand the range of use cases for the publisher-subscriber model.
4. Real-time capabilities: Many applications require real-time communication between components, and the publisher-subscriber model can benefit from faster message delivery and reduced latency. Techniques such as using in-memory data structures, optimizing network traffic, and reducing message size can all help to improve real-time capabilities.
5. Management and monitoring: Better management and monitoring tools can help to improve the publisher-subscriber model. Advanced management tools can provide greater visibility into message traffic, enable better resource allocation, and facilitate faster response times in the event of system failures.
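The sharding technique mentioned under Scalability can be sketched as follows. This is a minimal illustration under stated assumptions and not part of the project: `shard_for_topic`, `endpoint_for_topic`, and the `ENDPOINTS` list are hypothetical, and the sketch assumes each topic is served by exactly one publisher endpoint chosen by a stable hash of the topic name.

```python
import hashlib
from typing import List

# Hypothetical publisher endpoints; a real deployment would supply its own.
ENDPOINTS: List[str] = [
    "tcp://localhost:5555",
    "tcp://localhost:5556",
    "tcp://localhost:5557",
]


def shard_for_topic(topic: str, shard_count: int) -> int:
    """Map a topic name to a shard index in the range [0, shard_count)."""
    if shard_count <= 0:
        raise ValueError("shard_count must be positive")
    # hashlib yields the same digest in every process, unlike the built-in
    # hash(), which is randomized per interpreter run for strings.
    digest = hashlib.sha256(topic.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % shard_count


def endpoint_for_topic(topic: str) -> str:
    """Pick the endpoint that publishers and subscribers of a topic both use."""
    return ENDPOINTS[shard_for_topic(topic, len(ENDPOINTS))]
```

Because the mapping depends only on the topic name, publishers and subscribers agree on an endpoint without coordinating with each other. Changing the number of shards remaps most topics; consistent hashing is a common way to limit that.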
Overall, there is significant potential for advancements in the publisher-subscriber model, with many opportunities for improved performance, security, interoperability, real-time capabilities, and management and monitoring.

6. REFERENCES
1. Andrew S. Tanenbaum and Maarten Van Steen, "Distributed Systems: Principles and Paradigms".
2. https://ably.com/topic/pub-sub
3. Redis documentation