IoT Web Service Paper Review: Content-Based Subscription Service

Author: Zizhun Guo



Announcement:

This is a review of the paper “COSS: Content-Based Subscription as an IoT Service”. It was written solely for the Spring RIT CSCI-724 course.

Reprinting without authorization is forbidden. Zizhun Guo, all rights reserved.





Abstract—This review discusses the Content-Based Subscription Service (COSS) as used in the Internet of Things (IoT). It also relates the “as a service” concept, the application design pattern, and the current state of business applications in IoT to the purpose and design goals of the traditional publish/subscribe system.

Keywords—IoT, Web Service, Publish/Subscribe, Rule Set, Content-based subscribe

I. INTRODUCTION AND BACKGROUND

The “as a service” concept has long been popular, from software as a service to infrastructure as a service and platform as a service. The once-prevalent business model, in which a dealer sold products that were purchased and installed permanently and charged extra whenever more services were demanded, is now out of date and has been replaced by the “as a service” model. The Internet of Things (IoT), one of the most discussed emerging business topics in the IT industry, has also adopted this model. Because IoT devices are cheap, wireless technologies are mature, and cloud platforms are fully supportive, digitally connected products can be monitored and managed remotely in real time. This allows the data produced by IoT devices to be collected at scale and used by manufacturers and customers. As the “as a service” concept implies, customers need not meet every functional demand themselves to support their main business logic: they do not have to implement a data-filtering module to receive only the data they are interested in, or a routing protocol to make data collection more efficient. These functions can be implemented by other organizations, such as the solutions department of the device manufacturer or a cloud service provider. This idea can be realized through publish-subscribe systems in the form of RESTful web services.

II. TRADITIONAL PUB/SUB SERVICE VS CONTENT-BASED SUBSCRIPTION SERVICE

A. Publish-Subscribe Service

The publish/subscribe communication pattern is a software design pattern. It requires a publisher, which can be any source of data, to push out the messages in which subscribers are interested [3]. When a publish/subscribe system is adopted for IoT, the publishers are the sensors within the IoT devices, and the subscribers are user applications or other service providers. The messages are carried as HTTP payloads in JSON format, since JSON is more compact than XML. However, sending messages without further defining the communication behavior is not enough. Functions such as content filtering, retention of historical data, and control of delivery frequency need to be implemented through policies. Publish/subscribe services are therefore designed to be topic-based. Each message follows a data structure called a “topic”, which lets information be transferred between publishers and subscribers in a uniform format. The process is initiated by the publishers, which multicast an announcement message specifying the topic information and its policy service (rule); if a subscriber receives an announcement that satisfies its requirements, the connection is established [2]. After that, various behaviors can be enforced by the policy services.

B. Content-Based Subscription Service (COSS)

Although traditional publish-subscribe systems (topic-based services) can define a topic for sending and receiving specific messages, much like an interface, the core business logic must still be implemented by the subscriber application itself.
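The limitation described above can be made concrete with a minimal sketch of a topic-based broker. The class `TopicBroker`, the topic name `home/sensors`, and the message fields are hypothetical and are not taken from the COSS paper; the sketch only illustrates that such a broker forwards whole messages, so any content filtering has to live in the subscriber.

```python
from collections import defaultdict


class TopicBroker:
    """Toy in-memory topic-based broker (illustrative only)."""

    def __init__(self):
        # topic name -> list of subscriber callbacks
        self._subscribers = defaultdict(list)

    def subscribe(self, topic, callback):
        self._subscribers[topic].append(callback)

    def publish(self, topic, message):
        # Every subscriber of the topic receives the full message, unfiltered.
        for callback in self._subscribers[topic]:
            callback(message)


alerts = []


def on_reading(message):
    # The filtering logic lives in the subscriber: the broker knows
    # nothing about message content.
    if message["temperature"] > 30.0:
        alerts.append(message)


broker = TopicBroker()
broker.subscribe("home/sensors", on_reading)
broker.publish("home/sensors", {"room": "kitchen", "temperature": 22.5, "video": "<frame>"})
broker.publish("home/sensors", {"room": "attic", "temperature": 41.0, "video": "<frame>"})
```

Both messages reach the subscriber in full, including the unused `video` field, and only the subscriber's own check decides which one becomes an alert.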
To provide these more complex functions on behalf of subscriber applications, COSS communicates more flexibly: it delivers modified messages that contain only the user-selected attributes, instead of passing the raw topic directly between publishers and subscribers. In some scenarios the subscriber application is interested only in particular attributes of the topic data. For example, an application that detects abnormal indoor temperatures in order to warn the residents of a house does not need the video or voice-recording signals, so it can ignore that extra information by selecting only attributes such as spatial location and, of course, temperature. Moreover, receiving every data message would reduce efficiency, so content-based subscription services let subscribers express more complex rules on the content of the messages.

III. THE HIGH THROUGHPUT ISSUES

In traditional network publish/subscribe systems, the rules provided by the policy services scale well, achieving matching times of a few milliseconds with millions of attribute constraints. IoT scenarios are different. When the number of rules reaches the thousands, communication suffers from high data throughput [1]. In particular, when topics correspond to sensor types, most messages are timestamped readings, and a massive number of sensors with low sampling rates may report at the same time. Another case is a moderate number of sensors with high sampling rates. Both cases cause a data throughput issue. To address them, the COSS paper introduces a comprehensive solution focused on two aspects: the feasibility of technical migration onto the current infrastructure, and an algorithm that solves the throughput issue at the application level.
Regarding the first aspect, the authors do not intend to establish a new hardware infrastructure standard, whose rules hardly any released IoT device would be able to follow; instead they prefer to solve the problem at the application level. For this reason, as mentioned in the previous part, they design COSS as a RESTful web service. The web service sends and receives messages over HTTP, so no low-level implementation is involved, which makes the migration cost lower than building the infrastructure from scratch. For the second aspect, COSS needs to enforce a rule-based middleware model, including the service provider's engine, to handle the issue. The paper therefore mainly discusses the terminology related to this combination of techniques.

IV. TMR MODEL AND THE SERVICE

In a traditional topic-based subscription system there is no throughput issue, whether for a moderate number of sensors with a high sampling rate or for a massive number of sensors with a low sampling rate [1]. The problem arises when the system needs to apply stricter content-filtering criteria to the data source: the middleware cannot handle them, so the demand can only be satisfied in the application program, where I/O pressure rises [1]. In the TMR model, the middleware component in charge of filtering content is the rule topic. TMR stands for Tenant-Message-Rule, and the model consists of three essential parts. Reading the name from left to right, Tenant represents the multi-tenant structure, Message represents the data messages exchanged between publishers and subscribers through the model engine, and Rule represents the rule topic, which sets the rules that filter information according to the requested content.
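As a rough illustration of the Rule part, the sketch below models a rule topic as a predicate plus a list of attributes to forward, using the indoor-temperature scenario from Section II. The `RuleTopic` class, its fields, and the 30-degree threshold are assumptions made for this example; the COSS paper's actual rule language is not reproduced here.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass(frozen=True)
class RuleTopic:
    """Hypothetical rule topic owned by one tenant."""
    tenant: str
    name: str
    predicate: Callable[[dict], bool]  # content condition on the message
    selected: tuple                    # attributes the subscriber asked for

    def apply(self, message: dict) -> Optional[dict]:
        # Drop messages that fail the rule; otherwise forward only
        # the selected attributes.
        if not self.predicate(message):
            return None
        return {key: message[key] for key in self.selected if key in message}


overheat = RuleTopic(
    tenant="tenant-a",
    name="overheat",
    predicate=lambda m: m.get("temperature", 0.0) > 30.0,
    selected=("location", "temperature"),
)

raw = {"location": "attic", "temperature": 41.0, "video": "<frame>", "audio": "<clip>"}
filtered = overheat.apply(raw)
ignored = overheat.apply({"location": "kitchen", "temperature": 22.5})
```

Here the filtering runs in the middleware, so the subscriber receives only `location` and `temperature` for matching messages and nothing at all for the rest.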

Figure 1: TMR model

A. Single-tenant vs Multi-tenant

The two structures differ in the number of customers they support. A single-tenant service dedicates a single instance of the software and database to one customer, so customers have no way to share messages at a low level. A multi-tenant service, by contrast, lets the same software and database serve multiple customers while each customer remains invisible to and isolated from the others [4]. COSS adopts the multi-tenant approach because it costs much less, thanks to economies of scale and shared operations. It is widely used by IoT services. Most importantly, it is the way to migrate COSS onto the current infrastructure while minimizing the migration cost of service decomposition.

B. Message

It is reasonable to pack messages in JSON format: most IoT equipment has less capable hardware, and therefore more limited data-transfer ability, than a typical server, and JSON is more compact and more flexible to define than XML. In COSS, however, the messages collected by tenants must be defined under different schemas. The purpose is to prepare the later layers for filtering the data in rule topics.

C. Rule topic

This part of the TMR model sets up the filter criteria for the data messages described by the schemas. Data sources are skipped here, since a data source is considered part of a rule topic and only identifies where the data messages come from. In the figure, the matching relationship between schemas, data sources, and rule topics is therefore N to M, with N much smaller than M. This mechanism lets the same data message apply to multiple rule topics, which avoids duplication and increases data reuse, and so raises the achievable throughput.
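The N-to-M relationship can be sketched as a registry that binds several rule topics to one data source, so that a message is ingested once and evaluated against every bound rule topic. The names `register`, `dispatch`, and `thermo-1` are hypothetical, and the in-memory dictionary stands in for whatever storage the platform really uses.

```python
from collections import defaultdict

# Hypothetical registry: data source id -> rule topics that read from it.
rule_topics_by_source = defaultdict(list)


def register(source, topic_name, predicate):
    """Bind a rule topic (name plus content predicate) to a data source."""
    rule_topics_by_source[source].append((topic_name, predicate))


def dispatch(source, message):
    """Evaluate one ingested message against every rule topic bound to its source."""
    return [name for name, predicate in rule_topics_by_source[source] if predicate(message)]


# One data source (N = 1) feeds three rule topics (M = 3).
register("thermo-1", "overheat", lambda m: m["temperature"] > 30.0)
register("thermo-1", "freezing", lambda m: m["temperature"] < 0.0)
register("thermo-1", "attic-only", lambda m: m["location"] == "attic")

matched = dispatch("thermo-1", {"location": "attic", "temperature": 41.0})
```

The single message is not copied per rule topic; it is only tested against each predicate, which is the reuse the paragraph above describes.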
Unlike a topic-based service, which defines only one topic per service, COSS allows multiple rule topics to be grouped as a set from which the subscriber application selects. As one part of the middleware, rule sets serve specifically as filter criteria.

D. API Design

The specific API design is listed in Table 1:

Table 1: TMR model

The purpose of each method is as follows:

1) /service: Sets the service configuration; initiated by the tenants.
2) /schemas: Lets the tenants modify the schemas through a JSON document.
3) /datasource: Lets the tenants add or delete a data source.
4) /ruletopics: Lets the tenants create or delete rule topics to filter the data.

E. Service Overall

Figure 2: COSS architecture

As Figure 2 shows, tenants, which hold the server and provide the service to subscriber applications, can set up the rule sets and specify the data sources. The data published by the data sources is dispatched by the dispatch nodes (DN) on the COSS platform and applied to the different rule sets by the rule nodes (RN), where the rule engine processes and filters the messages. In the last step, the filtered data leaves through the exit nodes (EN) and is returned to the subscriber applications.

V. THE DISTRIBUTION ALGORITHM

A specific algorithm is needed to make the TMR model workable in practice. Moreover, the paper shows that the rule-set assignment problem is NP-hard. It therefore introduces a heuristic algorithm named Balanced Rule Engine Partitioning (BREP), which adjusts the distribution of workload according to the workload history. The paper spends several pages on the problem statement, the NP-hardness proof, and the pseudocode of the method; the point, however, is to partition the rules across different rule engines. This brings the discussion back to the API design of COSS, which explains why BREP works properly with the model logic. Under the TMR model, the servers expose services to tenants so that the tenants can control COSS through APIs and enable content-based subscription for their users, while the core logic is implemented on a stream computing platform hosted by a cloud computing company. All of the middleware computation discussed above is managed on that platform. The platform must therefore dispatch the data messages arriving from different tenants to the appropriate rule engines in real time, efficiently enough that high throughput does not become a problem.
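To give a feel for what partitioning rules across rule engines by workload history means, here is a minimal sketch of a generic greedy heuristic that always gives the next-heaviest rule to the currently lightest engine. This is explicitly not the BREP algorithm from the paper, whose details are not reproduced in this review; the function `partition_rules`, the rule names, and the load figures are all assumptions for illustration.

```python
import heapq


def partition_rules(rule_loads, engine_count):
    """Greedy heaviest-first assignment of rules to the least-loaded engine.

    rule_loads maps a rule name to its observed load (hypothetical units,
    for example matches per second taken from workload history).
    """
    # Min-heap of (current load, engine index): the lightest engine pops first.
    heap = [(0.0, index) for index in range(engine_count)]
    heapq.heapify(heap)
    assignment = {index: [] for index in range(engine_count)}

    # Place heavy rules first; ties are broken by name for determinism.
    for rule, load in sorted(rule_loads.items(), key=lambda item: (-item[1], item[0])):
        engine_load, index = heapq.heappop(heap)
        assignment[index].append(rule)
        heapq.heappush(heap, (engine_load + load, index))

    loads = {index: sum(rule_loads[r] for r in rules) for index, rules in assignment.items()}
    return assignment, loads


history = {"overheat": 50.0, "freezing": 20.0, "attic-only": 30.0, "door-open": 40.0}
assignment, loads = partition_rules(history, engine_count=2)
```

With these made-up loads both engines end up with a total of 70.0, but a greedy heuristic of this kind gives no optimality guarantee in general, which is consistent with the assignment problem being NP-hard.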
A deep understanding of this paper therefore requires implementation experience with both RESTful API development and the BREP algorithm, whose throughput handling is built at a higher level than in our implementation paper.

[1] C. Yaoliang, W. Jingjing, W. Hongwei, S. Huang, and C. Lin, “COSS: Content-based Subscription as an IoT Service,” 2015 IEEE International Conference on Web Services.
[2] C. Sara Granados, Electronic Design, Mar. 2018. Accessed on: Apr. 5, 2020. [Online]. Available: https://www.electronicdesign.com/markets/automotive/article/21806330/use-a-datacentric-publishsubscribe-framework-for-iot-applications
[3] “What is Publish-Subscribe (Pub/Sub)?” Accessed on: Apr. 5, 2020. [Online]. Available: https://www.pubnub.com/learn/glossary/what-is-publish-subscribe/
[4] F. Ian, Electronic Design, Data Insider, July 2019. Accessed on: Apr. 5, 2020. [Online]. Available: https://digitalguardian.com/blog/saas-single-tenant-vs-multi-tenant-whats-difference
