If you plan to pay for a traffic analysis and alerting system based on NetFlow, the most important question to ask your vendor concerns the performance of the NetFlow solution, which is expressed as a flows-per-second value. This parameter represents the number of concurrent flows hitting the NetFlow collector; in other words, it reflects the collector's capability to capture and process flow data in near real time. Here is some simple math. A flow represents a unique connection (session), and a typical volume is about 100 flows per minute per computer, with peaks around 400. At first glance this may seem high, but applications generate lots of new connections. For instance, a web page may load images from clouds (so they are delivered faster), banners, and ads. Instant messaging and OS and antivirus software checking for updates add more. DNS traffic counts as well, generating a large number of small flows.
It is important to note that the collection rate may be affected by flow variance, so your NetFlow vendor should specify whether the advertised flows-per-second figure reflects a burst or a sustained collection rate. For example, your Internet connection may be fully utilized by a few users downloading or watching movies in HD quality; in that case the flow collection rate and the NetFlow network overhead are small, because only a few flows are generated. Contrast this with a DDoS attack launched against an enterprise, where many small flows are generated with a serious impact on your network infrastructure. The flow collector should be able to process them all, and NetFlow sampling is very likely required to bring down the number of exported packets and reduce depletion of collector resources (CPU, RAM, disk space). Clearly, flow volume is not directly related to the bandwidth consumed.
Note: Good reporting performance depends on the overall performance of the disk. Although several attributes contribute to disk performance, such as seek time and latency, disks with 10,000 RPM (or higher) are recommended.
The design of the NetFlow solution depends on the size of the network infrastructure that contributes to the traffic. Say an enterprise has 20,000 computers and every node generates an average of 200 flows per minute. That equals 4,000,000 flows per minute, or about 66,666 flows per second. If you prefer a software NetFlow collector to a hardware one, it should scale beyond 66,000 flows per second. If performance is a concern, your vendor will probably need to deploy a hardware-based collector that can scale beyond a 100,000 flows-per-second collection rate. For even higher flow volumes, the solution should be distributed so that collection rates can grow further: multiple collectors run on separate, geographically distributed servers and synchronize their data to a centralized analyzer server. When the network connection to the centralized server fails, a collector must be able to hold its data until the connection is reestablished.
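The sizing arithmetic above can be sketched in a few lines of Python (the function name and the 20,000-node / 200-flows-per-minute figures are taken from the example; nothing here is vendor-specific):

```python
# Back-of-the-envelope collector sizing, using the figures from the text:
# 20,000 nodes, each averaging 200 flows per minute.
def required_fps(nodes: int, flows_per_minute_per_node: float) -> float:
    """Return the sustained flows-per-second rate a collector must handle."""
    return nodes * flows_per_minute_per_node / 60.0

fps = required_fps(20_000, 200)
print(f"{fps:,.0f} flows/s")  # prints "66,667 flows/s"
```

Remember to size against the peak (the text cites bursts of up to 400 flows per minute per computer), not just the average, if the vendor quotes a sustained rate.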
Ideally, unaggregated (raw) data should be stored for a long period in raw-data storage so that every flow can be checked back in time if needed. Although this approach offers the highest data granularity, it is often not feasible due to limited resources. The retention period, or the number of flow records kept in raw storage, should be configurable; ideally, the NetFlow analyzer should suggest a value based on the free disk space. To save disk space and keep the NetFlow analyzer performing well, raw data are typically summarized automatically by an aggregation mechanism at a certain time interval, e.g. 1 minute, 15 minutes, 60 minutes. Aggregated data are stored in aggregation storage and used for historical reports, capacity planning, and trend analysis; obviously, they do not offer the same level of granularity as the raw data. Ask your vendor about the aggregation mechanism (e.g. top-N records).
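To make the aggregation idea concrete, here is a minimal sketch of time-interval aggregation with a top-N cut-off. The record fields (`start`, `src`, `bytes`) and the bucket logic are illustrative assumptions, not the schema of any particular product:

```python
from collections import defaultdict

def aggregate(flows, interval_s=60, top_n=2):
    """Roll raw flow records up into per-interval byte totals per source,
    keeping only the top-N sources in each interval (a top-N aggregation)."""
    buckets = defaultdict(lambda: defaultdict(int))
    for f in flows:
        # Align each flow's start time to the beginning of its interval.
        bucket = f["start"] // interval_s * interval_s
        buckets[bucket][f["src"]] += f["bytes"]
    # Keep only the heaviest top_n sources per interval.
    return {
        t: dict(sorted(per_src.items(), key=lambda kv: -kv[1])[:top_n])
        for t, per_src in buckets.items()
    }

raw = [
    {"start": 5,  "src": "10.0.0.1", "bytes": 1000},
    {"start": 20, "src": "10.0.0.2", "bytes": 500},
    {"start": 30, "src": "10.0.0.3", "bytes": 100},
    {"start": 70, "src": "10.0.0.1", "bytes": 2000},
]
print(aggregate(raw))
# {0: {'10.0.0.1': 1000, '10.0.0.2': 500}, 60: {'10.0.0.1': 2000}}
```

The 100-byte flow from 10.0.0.3 is dropped by the top-N cut-off, which is exactly the granularity loss the paragraph above warns about.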
One could ask how much bandwidth a flow export consumes. It depends on several factors, such as whether sampling is used and whether additional fields are exported (BGP-related information, MAC addresses, VLAN IDs, VRFs, MPLS labels, etc.). A rough rule of thumb is that NetFlow export traffic represents about 1.5% to 3% of the total traffic seen by an exporter. For instance, if a single Ethernet frame (1500 bytes) carries about 24-30 flow records and an exporter sends 3,000 flows per second to the collector, the export traffic is about 1.5 Mbps (1500 / 24 × 3000 × 8).
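The same estimate can be written as a small helper. The defaults (1500-byte export packets, roughly 24 flow records per packet) come straight from the example above and are assumptions, not protocol constants:

```python
# Rough NetFlow export bandwidth estimate, using the figures from the text:
# ~1500-byte export packets, each carrying ~24-30 flow records (24 is the
# conservative, worst-case packing).
def export_bandwidth_bps(flows_per_second: float,
                         flows_per_packet: int = 24,
                         packet_bytes: int = 1500) -> float:
    """Estimate export traffic in bits per second."""
    packets_per_second = flows_per_second / flows_per_packet
    return packets_per_second * packet_bytes * 8

print(export_bandwidth_bps(3000) / 1e6)  # prints 1.5 (Mbps)
```

Exporting extra fields shrinks the number of records per packet, so raising `packet_bytes` per record (i.e. lowering `flows_per_packet`) pushes the estimate toward the upper end of the 1.5%-3% range.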
Most modern network equipment can export NetFlow v5, v9, or IPFIX flow records. As IPFIX (RFC 7011), based on Cisco NetFlow v9, is the only official industry flow standard, a NetFlow collector should also support third-party implementations, including J-Flow (Juniper) and sampled flow (sFlow).
Unfortunately, most solutions nowadays offer different editions and licensing models depending on the size of your network infrastructure (e.g. number of exporters or collectors, ports, collection rates) and, where appropriate, on the purpose of the flow records (capacity planning versus forensic investigations). It is hard to predict or estimate the full price you would pay for such a product as the network evolves.
Ideally, a NetFlow vendor should be able to provide flat, clear pricing as well as a trial period long enough for you to evaluate whether the solution meets your needs. This lets you confirm that the NetFlow solution has the required feature set and meets your performance requirements. You can also test compatibility with your network infrastructure, ease of integration, the user interface, and maintenance.