Written by Radoslav Dejanović on 20-05-2008 00:48
Published in: English language
A few days ago I was told that there's another fast-tracking process going on in ISO. This time it isn't really about computers or office formats, but the story is interesting nevertheless: it's about ISO/IEC DIS 14908 - Open Data Communication in Building Automation, Controls and Building Management - Control Network Protocol.

You see, modern buildings have changed since Roman times. OK, except that the concrete the Romans invented is of much better quality than what is used today. But apart from being built to last, Roman buildings didn't have those nice devices that open a garage door for you, operate an elevator, turn lights on and off... Now, if you say X10, you're on the right track.

So, what is the problem with 14908? It's just another standard for having intelligent devices in your walls, so what's the big deal? I have looked at it, and it reminds me of ISO's OOXML vs ODF adventure. You see, there are already standards that cover everything this proposal offers. There's, for example, EIB, which is a prominently German protocol. It's a little bit older, but it is what is considered an open standard. There's a spec, so you can go and build any part of the building automation, be it a controller, an actuator, a sensor, anything. Just keep it within the specification.

14908 doesn't look like a really open standard. It's patent encumbered, for a start. That means royalties imposed on manufacturers and customers, and there is probably going to be a manufacturer with exclusive rights to produce a piece of hardware that you have to buy if you want this network to operate.

Granted, this new standard does something better than, for example, EIB: while EIB talks about a 9600 bps communication speed, a maximum of 65536 connected devices, and RS232 to communicate with your PC, 14908 gives us a nice view of the wonderful TCP/IP world. Well, almost.

You see, I am not an expert in embedded electronics or building automation.
I am no expert on computer networking and the TCP/IP stack either, but I do think I can say that I know at least *something* about that technology. So I have read the part of the 14908 proposal that talks about networking. It looks like the main idea of the new proposal is to get rid of the separate data buses that older standards mainly use for data transmission between devices, and to use computer network cabling instead: the humble UTP.

Now, let us think. Is the cost of cabling a separate twisted pair for building automation really that bad? One can probably save a lot of money by using the same UTP-based cabling for the computer network and building automation... but would you really want that? I might be very wrong here, and it might turn out that the authors never really considered mixing the computer network with the CNP network, in which case it would not be such a disaster. But what if I'm right?

Just to be sure that I'm not talking complete nonsense here, I have asked a few friends whom I consider quite good with TCP/IP to give me feedback on this.

First of all, those devices use a subset of TCP/IP, and there's no rule that the full stack should be implemented. If you were a network administrator, would you allow devices that are completely out of your control to share your network? Indeed, you could separate them into VLANs, but that's extra effort, and taking care of building automation devices shouldn't be a network administrator's problem.

Let's see what is inside. The preferred transport is UDP, which is connectionless and does not provide any means to be sure that your datagram has reached its destination. It's quite OK for sending a rapid stream of data where you don't really care if some of it gets lost (think VoIP or computer games). However, the authors of 14908 talk about quasi-real-time performance (something you want to have if you control the elevator speed and position, for example, or security doors) and data consistency!
A quick reminder: TCP can achieve quasi-real-time performance and has the means to ensure data consistency. In 14908, TCP is an option, mentioned primarily for device configuration. And to ensure that UDP packets sent over different routes can later be arranged in the correct order, devices should append a sequence number to each UDP packet so that the receiver can sort them out. Again, something that TCP already has.

This thing doesn't work through NAT. If you have two buildings, they have to have a clean route between them and they'd better not be using NAT, or there will be no CNP/IP communication between them. Since many modern computer networks use NAT these days, this could really be a big issue for the network administrator.

There's a "CNP Wants All Broadcasts" flag that can be turned on and off. I can't think of a reason why a device should not listen to broadcasts, which are supposed to carry some important information. Oh, there is one good reason: if you have a cheap device with a measly CPU and a tiny RAM of 256 bytes, you don't want to spend time processing broadcasts. But still, this looks like bad design.

And, while we're at it, the proposal talks about stale packet detection. In real-time operations it is vitally important to know when the packet you received is too old to reliably reflect the real situation: elevator speed and position data is irrelevant if it is more than a second or so old! Now, the paper states that devices can switch off stale packet detection if it is not necessary; it looks like another compromise for devices with low computing power.

Speaking of stale packet detection, please read this (taken from 14908): "A packet is considered to the stale if the time it takes for the packet to be transported from the sender to a receiver in an IP channel is greater than the channel timeout period (CTP).
The CTP is a period of time that represents a reasonable upper bound on the time it takes a packet to go from a sender to a receiver on the IP channel. This European Standard does not cover how the channel timeout period is determined. Suffice it to say that it has units of milliseconds and is known by each of the CNP/IP devices in the IP channel."

Suffice it to say that this is a very, very vague definition, and it is still trying to go through fast-track. But, for argument's sake, let's say that this is acceptable for a real-time application: you do want to know how old your packet is, and we shouldn't be too grumpy about different devices having different thresholds for "this packet is too old" rather than one single value defined for all devices on the network.

But look at this: "In the case of CNP/IP to CNP/IP routers the packets are re-stamped with the most recent time before being forwarded onto the next IP channel." What this means is that if you have four network segments, A->B->C->D, and your packet has to jump from A to B to C to D, it is re-stamped with a new time at each hop. Well, if you really must know the exact time when the data entered the network (as is the case with real-time systems), re-stamping means you will never be sure that your data is not too old to be considered relevant.

That was about time-stamping UDP packets so that you know the time. Everything is kept in check using NTP, which is great for a local segment, and even for distributed networks as long as all time servers are in sync, but: "Since the time server is the common basis for time on the network the device may continue forwarding packets only as long as it is reasonably sure that it is within the margin of error of the time server before it went off line."
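Going back to the re-stamping for a moment, here is a minimal Python sketch of why it worries me. Nothing in it comes from 14908: the function names, the 600 ms per-hop delay and the 1000 ms CTP are made-up values, and it assumes that every intermediate router simply overwrites the timestamp, which is only my reading of the quoted sentence.

```python
def apparent_age_ms(hop_delays_ms, restamp):
    """Age of a packet as seen by the final receiver, in milliseconds.

    hop_delays_ms: hypothetical transit time of each hop (A->B, B->C, C->D).
    restamp: if True, every intermediate router overwrites the timestamp.
    """
    now = 0      # simulated clock, started when the sender stamps the packet
    stamp = 0    # timestamp currently carried by the packet
    last_hop = len(hop_delays_ms) - 1
    for hop, delay in enumerate(hop_delays_ms):
        now += delay
        if restamp and hop != last_hop:
            stamp = now  # the router replaces the original timestamp
    return now - stamp


def is_stale(age_ms, ctp_ms):
    """Stale means older than the channel timeout period (CTP)."""
    return age_ms > ctp_ms


CTP_MS = 1000           # assumed CTP, not a value from the standard
HOPS = [600, 600, 600]  # assumed delays for A->B, B->C, C->D

for restamp in (False, True):
    age = apparent_age_ms(HOPS, restamp)
    print(f"restamp={restamp}: apparent age {age} ms, stale={is_stale(age, CTP_MS)}")
```

With these numbers the data is really 1800 ms old when it reaches D, but with re-stamping the receiver only sees the last 600 ms and accepts the packet as fresh.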
I'd say that if the NTP server goes down, the entire building would stop within two hours, since all devices will drift out of sync and, as the error accumulates, they will cease sending packets one by one. Lights out. Doors won't open. Security systems shut down... pretty scary, isn't it?

Here's another gem: "Newer or older versions of this data may be determined with a resolution of 1 second by looking at this field." This means that the resolution for telling which data precedes which is one second, which is ages in real-time operations. Maybe I didn't get that right?!

And there are definitely things that must be clarified, because they are simply not true: "The UDP payload length of approximately 548 bytes causes restrictions in the membership of channels and causes other problems when nodes are considered." I'm sorry, but a UDP datagram can carry up to 65507 bytes of payload over IPv4, and the headers are just a tiny part of it...

"ACK messages are not sent on TCP connections;" - huh?!?! "retransmissions are not performed on TCP connections;" - huh again? This is probably about CNP not having to send its own ACKs over TCP, rather than a claim that TCP doesn't do ACKs or retransmissions. I find this quite confusing.

Regarding ACKs: "ACKs are not retransmitted, but the state change in a device is always such that receipt of a duplicate previous message in the protocol causes no state change and causes retransmission of the appropriate ACK." So are they retransmitted or not?

Then this: "segment packets are never used. Packets, regardless of size, are sent in their entirety"... This seems to collide with another sentence from the same proposal: "Certain CNP/IP messages may cross a UDP datagram boundary. In such cases a special segmentation protocol is defined which can be used to facilitate this." It's quite contradictory.
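Coming back to the 548 versus 65507 bytes question, both numbers fall out of simple header arithmetic. The Python sketch below uses a made-up function name, and the link between 548 and the 576-byte datagram that every IPv4 host must be able to accept (RFC 791) is an assumption about where the figure comes from, not something the quoted text says.

```python
IPV4_HEADER = 20   # bytes, IPv4 header without options
UDP_HEADER = 8     # bytes


def max_udp_payload(ip_datagram_size):
    """Largest UDP payload that fits in one IPv4 datagram of the given size."""
    return ip_datagram_size - IPV4_HEADER - UDP_HEADER


# 65535 is the largest value of the 16-bit IPv4 Total Length field.
print(max_udp_payload(65535))  # 65507
# 576 is the datagram size every IPv4 host must accept (RFC 791).
print(max_udp_payload(576))    # 548
```

So 548 bytes may be a conservative "always safe" size rather than a hard limit of UDP, which is a different claim from the one the proposal seems to make.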
Or might the reason be that all this exists just to bring in a special segmentation protocol, one that does the same thing a normal TCP/IP implementation does, but is special because there's a tiny patent attached to it?

You can do data bunching as well. This means that if there's more room in a UDP packet, which the proposal confines to 548 bytes without explaining why, you can squeeze two or more data payload segments into it. Just so you know.

Speaking of vague, there's an interesting method by which a device can check whether another device supports TCP: "The only way for the device to determine whether the configuration server supports TCP is to have a connection succeed. This mechanism assumes that a node that does not support TCP cannot respond with Connection Refused." Isn't that great? You could send a UDP packet asking the other side whether it supports TCP, but no: you have to wait for a freakin' timeout to find that out? And what if there's a firewall or a smart router in between, one that can silently drop packets? Who's going to debug that? The network administrator? The electrician?

Wondering about DoS attacks? "If the number of simultaneous connections supported by a server is smaller than the number of devices on the channel, devices might receive connection refused messages from the server when trying to connect. Such responses should indicate that there are insufficient resources on the server. In this case they try again after a suitable amount of time, or elect to use UDP." If a device is too busy to answer, there's no point in switching to UDP, for two reasons. First, a device that can't answer packets is not going to answer those UDP packets either. Second, since the connecting device can't open a connection to a server that has no idea what is going on, it will assume that the server can't talk TCP (see the detection method above) and happily send a stream of UDP datagrams to the poor DoS-ed bastard, causing even more mayhem on the network.
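To show what that TCP detection method boils down to in practice, here is a minimal Python sketch that uses only the standard socket module. The function name probe_tcp, the two-second timeout and the three result strings are my own choices for illustration, not anything defined by 14908.

```python
import socket


def probe_tcp(host, port, timeout_s=2.0):
    """Try one TCP connection and report how the attempt ended."""
    try:
        with socket.create_connection((host, port), timeout=timeout_s):
            return "accepted"   # the only outcome that proves TCP support
    except ConnectionRefusedError:
        return "refused"        # something answered, but nobody is listening
    except OSError:
        return "no answer"      # timeout, unreachable host, dropped packets...
```

Against a firewall that silently drops packets, the call just sits there until the timeout expires and returns "no answer", which tells you nothing about whether the other side speaks TCP, is overloaded, or is simply unreachable.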
I like this one: "Security in CNP/IP devices is optional." And this one, too: "The level of security described in this sub-clause is authentication. Messages sent using the scheme described here are authenticated as coming from a trusted source. Information inside the messages is not encrypted and not hidden from inspection." We're talking about devices that might share your computer network, a place known for the peaceful coexistence of many people who would never do anything bad... like sniff the network, then craft packets so that your elevator goes through the roof, or stops between floors. Or starts going up and down and up and down...

There's more of that, but those are the most interesting parts. The people I talked with about this did not completely agree on the possible reason for such vague definitions. The closest we got is that the reason might be cheap devices with a slow CPU and just a handful of bytes of RAM, so the protocol is tailored to let them use the network as efficiently as possible. That might be fine for them, but I certainly wouldn't like to have any of those on a computer network I am responsible for...

There are some patents inside, you know: thirteen of them. Most expire in 2008, some in 2011 and 2012. A conspiracy theorist would say that somebody is trying to push this through ISO as quickly as possible to get what is left of the royalties...

So, what makes ISO/IEC DIS 14908 an inappropriate candidate for the fast-track process?

1. There already are standards, and this one doesn't bring anything really new.
2. The definitions are vague, and there seem to be a lot of things that should be clarified BEFORE this could become an ISO standard.
3. There are patents inside; this does not disqualify a proposal, but it certainly is not an open standard, so it should be considered sub-par to open standards that have no patent encumbrance.
At the end of the day, it might be that I didn't understand the proposal and that most if not all of my objections are incorrect (and I'm making a fool of myself). However, what I do believe is that this paper is really vague and open to interpretation, and not something we would like to become a standard in that state. It's OK if 14908 goes through the full standardization procedure, as that would weed out any vague or incorrect things. Going through fast track, however, looks too similar to the OOXML story, where a largely unfinished standard rockets through ISO because there's a company that might lose some revenue if the proposal has to go through the tedious standard procedure.

ISO really seems ready to fast-track anything these days. We should watch every fast-track process more closely, or else those people from ISO might one day go to their offices and end up stuck in an elevator going up and down and up and down...

(update) Due to popular demand, here's the clarification of "prominently German EIB": EIB is a protocol devised by EIBA, an association of many organizations, most of which are from Germany. No more, no less.

Here's the list of patents in 14908 as well:

U.S. Patent No. 4,918,690 Network and Intelligent Cell for Providing Sensing, Bi-Directional Communications and Control
U.S. Patent No. 4,941,143 Protocol for Network Having a Plurality of Intelligent Cells
U.S. Patent No. 4,955,018 Protocol for Network Having a Plurality of Intelligent Cells
U.S. Patent No. 4,969,147 Network and Intelligent Cell for Providing Sensing, Bi-Directional Communications and Control
U.S. Patent No. 5,182,746 Transceiver Interface
U.S. Patent No. 5,297,143 Network Communication Protocol Including a Reliable Multi-Tasking Technique
U.S. Patent No. 5,319,641 Multiaccess Carrier Sensing Network Communication Protocol with Priority Messages
U.S. Patent No. 5,420,572 Configuration Device for Use in a Networked Communication System
U.S. Patent No. 5,500,852 Method and Apparatus for Network Variable Aliasing
U.S. Patent No. 5,513,324 Method and Apparatus Using Network Variables in a Multi-Node Network
U.S. Patent No. 5,519,878 Method and Apparatus for Network Node Identification
U.S. Patent No. 5,856,972 Duplicate Message Detection Method and Apparatus
U.S. Patent No. 5,737,529 Networked Variables