AROW - Software Operation
AROWBFTP is divided into Send and Receive scripts. AROWBFTP maintains the file tree structure across the data diode and monitors when files have changed. The detailed management of which files and data are made available differs for every deployment and changes over time, so it is considered a network administration task.
AROWBFTP is written in Python, allowing relatively simple, auditable and portable code. It has been successfully tested with Windows XP, 7 and 8, Mac OS X 10.3.9, and 32-bit and 64-bit versions of Linux. It should also work on other operating systems that support Python, perhaps with a few minor modifications.
Compensating for data losses
Since the link is unidirectional, the transmitter does not receive any acknowledgment and therefore cannot be warned of data losses on the receiving side. TCP is a connection-oriented protocol and would normally re-try if data were not successfully received. AROW includes a substantial gated hardware buffer to support re-tries and prevent buffer saturation.
In the case of data loss, the only information available is that one or more frames were not successfully received. The AROWBftp programs include error detection and FIFO queues to decouple packet transmission from processing, and also include worker threads that process data in the background without holding up transmission.
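The decoupling described above can be sketched as follows. This is a minimal illustration using Python's standard queue and threading modules, not the AROWBftp code itself; the function name run_pipeline and the use of None as an end-of-stream sentinel are assumptions made for the example.

```python
import queue
import threading


def run_pipeline(frames, process):
    """Hand frames to a background worker through a FIFO queue.

    `frames` stands in for the receive loop and `process` for the
    per-frame work (checking, writing to disk, and so on). Both are
    hypothetical placeholders.
    """
    fifo = queue.Queue()  # unbounded, so put() never blocks the receive loop
    results = []

    def worker():
        while True:
            frame = fifo.get()
            if frame is None:  # sentinel: no more frames will arrive
                break
            results.append(process(frame))

    thread = threading.Thread(target=worker, daemon=True)
    thread.start()
    for frame in frames:  # the receive loop only enqueues, it never processes
        fifo.put(frame)
    fifo.put(None)
    thread.join()
    return results
```

An unbounded queue keeps the receive loop from ever waiting on the worker, at the cost of memory growth if processing falls behind for long periods; a bounded queue would trade that for possible stalls.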
Data is transmitted with a fixed-size header that contains a unique marker sequence and a frame length indicator. On reception, this information can be used to determine the integrity of the data, to re-align data from a continuous stream into data packet form, and to perform cyclic redundancy checks (CRC) on the data. Together with file name and date stamping, substantial information is available to ensure the correct reconstruction of files streamed across AROW.
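As an illustration of the framing idea above, the sketch below packs a payload behind a fixed-size header and recovers valid payloads from a continuous byte stream. The actual AROWBFTP header layout is not given here, so the marker value, the field order and the choice of CRC-32 (via zlib.crc32) are all assumptions.

```python
import struct
import zlib

MARKER = b"\xA5\x5A\xC3\x3C"     # hypothetical marker sequence
HEADER = struct.Struct(">4sII")  # marker, payload length, CRC-32 (assumed layout)


def pack_frame(payload: bytes) -> bytes:
    """Prefix the payload with marker, length and checksum."""
    return HEADER.pack(MARKER, len(payload), zlib.crc32(payload)) + payload


def unpack_frames(stream: bytes):
    """Yield valid payloads, re-aligning on the marker after any damage."""
    pos = 0
    while True:
        pos = stream.find(MARKER, pos)
        if pos < 0 or pos + HEADER.size > len(stream):
            return
        _, length, crc = HEADER.unpack_from(stream, pos)
        start = pos + HEADER.size
        payload = stream[start:start + length]
        if len(payload) == length and zlib.crc32(payload) == crc:
            yield payload
            pos = start + length
        else:
            pos += 1  # false marker or damaged frame: keep searching
```

A frame whose checksum does not match is skipped and the search resumes at the next marker, so a single damaged frame does not prevent later frames from being recovered.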
Data can be transmitted more than once in case it gets lost. Inside AROWBFTP this redundancy is implemented at the file level: files are simply re-transmitted, up to a limit of R tries. Note that the use of redundancy is secure, but it is better to avoid the need for it in the general case. If a file frame must be retransmitted to be received completely, the effective receive rate is divided by two or more.
The transmission flow should be limited to a value that does not permit loss of data on the first transmission under normal utilisation of the receiving machine. The value of this limit therefore depends strongly on the machine quality, the operating system, the applications running, and the size of the files transmitted. Too high a limit will result in buffer overruns (the data cannot be processed on the receive side quickly enough), giving data corruption and loss.
The state of the receive-side buffer can be interrogated using the supplied contol_arow.py script. A typical value with a reasonable PC and a dedicated GbE link is of the order of 500-800 Mbit/s (set flow limit -l 800000). For a 100 Mbit/s link, a typical value is 50 Mbit/s (set flow limit -l 50000).
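The two examples above suggest that the -l value is expressed in kbit/s (800000 for 800 Mbit/s, 50000 for 50 Mbit/s). Under that assumption, the short sketch below converts a target rate to a -l value and shows the pacing interval such a limit implies for a given frame size. The function names are hypothetical and are not part of AROWBFTP.

```python
def flow_limit_arg(mbit_per_s: float) -> int:
    """Convert a rate in Mbit/s to a -l value, assuming -l is in kbit/s."""
    return round(mbit_per_s * 1000)


def send_interval(frame_bytes: int, limit_kbit_s: int) -> float:
    """Seconds between frames of this size needed to stay at the limit."""
    return frame_bytes * 8 / (limit_kbit_s * 1000)


if __name__ == "__main__":
    limit = flow_limit_arg(800)
    print("-l", limit)
    print("interval for 1500-byte frames:", send_interval(1500, limit), "s")
```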
Increase the IP stack Buffer size
Increasing the IP stack buffer size is an empirical way to reduce the risk of data loss, and therefore involves an amount of ‘tuning’ that comes with experience and running time. It is, however, dependent on the operating system and must be configured outside of AROWBFTP.
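After changing the operating system settings, it can be useful to check what receive buffer size a socket is actually granted. The sketch below uses the standard SO_RCVBUF socket option; the helper name receive_buffer_size is an assumption, and the system-wide limits themselves (for example net.core.rmem_max on Linux) still have to be raised at the operating system level.

```python
import socket


def receive_buffer_size(requested=None):
    """Return the receive buffer size (bytes) the OS grants a TCP socket.

    If `requested` is given, ask for that size first. The OS may round,
    double or cap the value according to its own limits.
    """
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        if requested is not None:
            sock.setsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF, requested)
        return sock.getsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF)


if __name__ == "__main__":
    print("default:", receive_buffer_size())
    print("after requesting 4 MiB:", receive_buffer_size(4 * 1024 * 1024))
```

If the second figure is much smaller than the size requested, the system-wide maximum is probably still limiting the buffer.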
Recovery from interruption
The synchronisation with file recovery mode (option -c) permits tree synchronisation to be stopped and resumed without retransmitting all the data from the start. Previously transmitted files are not transmitted again. If the high side detects an incident that prevents good reception of data while the low side continues transmitting, the recovery file must be corrected to take account of the elements not received. This correction can be applied while transmission continues. The Python script xfk_reset.py provides a means of automatic correction: it changes the status of files transmitted from a particular date onward.
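The kind of correction described above can be illustrated with a small sketch. The real recovery file format and the internals of xfk_reset.py are not described here, so the record structure (a dictionary of path to status and send time), the status names and the function reset_since are purely hypothetical.

```python
from datetime import datetime


def reset_since(records, since):
    """Mark files sent at or after `since` as pending so they are resent.

    `records` maps a file path to {"status": ..., "sent_at": ...}.
    This structure is a hypothetical stand-in for the recovery file.
    """
    reset = []
    for path, rec in records.items():
        if rec["status"] == "sent" and rec["sent_at"] >= since:
            rec["status"] = "pending"
            reset.append(path)
    return sorted(reset)


if __name__ == "__main__":
    records = {
        "a.txt": {"status": "sent", "sent_at": datetime(2024, 1, 1, 9, 0)},
        "b.txt": {"status": "sent", "sent_at": datetime(2024, 1, 3, 9, 0)},
    }
    # The incident began on 2 January, so only later files are reset.
    print(reset_since(records, datetime(2024, 1, 2)))
```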
High-side Buffer overruns
The main buffer is 2 Gbit deep. This is enough to store over 2 seconds' worth of data at full gigabit Ethernet wire speed. If data gets congested on the high-side server, packets may be dropped. This is expected TCP behaviour and is dealt with by resending the lost packet. While data is being resent, the buffer continues to fill with data from the optical connection. Once the lost data has been resent, data queued in the buffer can resume being sent out. The buffer gradually recovers to normal levels, depending on network slack. Gating the buffer prevents overruns by slowing the data received on the low side until there is sufficient space in the high-side buffer.
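The figures above can be checked with some simple arithmetic. The sketch below assumes that "2 Gbit" means 2 x 2^30 bits and that gigabit Ethernet delivers 10^9 bit/s, which gives slightly over 2 seconds of buffering. It also estimates how long a backlog takes to drain when the output side has some slack over the input rate. Both function names and the example rates are assumptions for illustration.

```python
def buffer_seconds(buffer_bits: float, line_rate_bit_s: float) -> float:
    """Time the buffer can absorb input at line rate with no output."""
    return buffer_bits / line_rate_bit_s


def recovery_seconds(backlog_bits: float, drain_bit_s: float, fill_bit_s: float) -> float:
    """Time for a backlog to drain when output outpaces input."""
    slack = drain_bit_s - fill_bit_s
    if slack <= 0:
        raise ValueError("no network slack: the backlog cannot drain")
    return backlog_bits / slack


if __name__ == "__main__":
    print(buffer_seconds(2 * 2**30, 1e9))     # a little over 2 seconds
    print(recovery_seconds(1e9, 1e9, 800e6))  # 1 Gbit backlog, 200 Mbit/s slack
```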
Websites are by design interactive and require the user to ask for data and information via their browser. Clearly this is not allowed or available from a network protected by AROW, so a different strategy must be employed. For websites with mostly static content, offline web caching tools are appropriate. Tools such as HTTrack, available for most operating systems, allow websites to be downloaded as a tree structure, creating files with all the appropriate links on a local storage system and thus a browsable offline cache.
These tools necessarily require some administration, to set up the levels of browsing, provide regular refreshes and so on, but when used with AROW enable users on the protected network to browse most of the functionality of a website, although of course no interaction that requires contacting the original site can take place.
AROWBftp includes the ability to connect to a TCP stream and mix-in TCP data with file data. AROWBftp acts as a server on each side of the data diode, allowing client streams to connect directly.
Direct Data Streaming
It is possible to stream data directly through AROW without using AROWBftp. See the AROW Data manual for connection information. Direct (TCP) connection is supported on both the unprotected and protected sides of AROW: simply connect as a client to each server socket. Data received on the unprotected low side is made available on the protected high side as soon as the high-side client connects. The same principle can be used for virtually any audio/video source.
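A minimal sketch of the client side of such a connection is shown below, using only Python's standard socket module. The host names and port numbers must come from the AROW Data manual; the values in the usage comment, and the function names stream_to_arow and read_from_arow, are hypothetical.

```python
import socket


def stream_to_arow(host, port, chunks):
    """Connect as a client to the low-side server socket and send data.

    Returns the number of bytes handed to the socket.
    """
    sent = 0
    with socket.create_connection((host, port)) as sock:
        for chunk in chunks:
            sock.sendall(chunk)
            sent += len(chunk)
    return sent


def read_from_arow(host, port, bufsize=65536):
    """Connect as a client to the high-side server socket and yield data."""
    with socket.create_connection((host, port)) as sock:
        while True:
            data = sock.recv(bufsize)
            if not data:  # the server closed the connection
                break
            yield data


# Hypothetical usage; the address and port are placeholders:
# stream_to_arow("arow-low.example", 5000, [b"first block", b"second block"])
```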
UDP streaming (unicast and multicast) is also directly supported (single and dual versions), or can be accomplished using the Linux application socat (HA version). Instructions are available under AROWUDP.
Filtering of files using ExeFilter is now available.
For more detailed information regarding AROW’s operation, please see the AROW manual (Chapter 5).