Does anyone know of a programme that can take a Wireshark (pcap) trace and turn it into a visual network topology?
I have 3 pcap files with a lot of data and I really want to see if I can make sense of some things.
I have played with tools like NetworkMiner, but found nothing that gives a visual cue to the data.
You are in fact asking two questions:
How to discover the network topology from network traces
How to visualize the discovered topology
Topology Discovery
This is the hard part. The community has not yet developed reliable tools, because network traffic exhibits so much hard-to-deal-with crud. The most useful tool that comes to mind in this space is Bro, which creates quality connection logs.
It is straightforward to extract communication graphs, i.e., graphs that show who communicates with whom. By weighting the edges with some metric (number of packets/bytes/connections), you can get an idea of the relative contribution of a given node.
For more sophisticated analyses, you will have to develop some heuristics. For example, detecting routers may involve looking at packet forwarding behavior or extracting default gateways from DHCP ACK messages. Bro ("the Python for the network") allows you to codify such analysis in a very natural form.
Graph Visualization
The low-key approach involves generating GraphViz output. AfterGlow offers some wrapping that makes the output more digestible. For inspiration, check out http://secviz.org/, where you will find many examples of such graphs. Most of them were created with AfterGlow.
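As a rough sketch of the whole pipeline, something like the following turns a pcap into a packet-weighted communication graph in DOT format (this assumes Scapy is installed; "trace.pcap" and "comm.dot" are placeholder file names):

```python
# Sketch: build a packet-weighted communication graph from a pcap and emit DOT.
# Assumes Scapy is installed; "trace.pcap" and "comm.dot" are placeholder names.
from collections import Counter
from scapy.all import PcapReader, IP

edges = Counter()
for pkt in PcapReader("trace.pcap"):      # stream the file; rdpcap() would load it all into memory
    if IP in pkt:
        edges[(pkt[IP].src, pkt[IP].dst)] += 1

with open("comm.dot", "w") as f:
    f.write("digraph comm {\n")
    for (src, dst), count in edges.items():
        # Edge weight = number of packets; GraphViz can map this to pen width or labels.
        f.write(f'  "{src}" -> "{dst}" [label="{count}"];\n')
    f.write("}\n")
```

You can then render the result with something like `dot -Tpng comm.dot -o comm.png`, or import the node/edge list into Gephi.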
There is also Gephi, a fancier graph visualization engine that supports a variety of graph input formats. The generated graphs look quite polished and can also be explored interactively.
I'm looking at options for a graph database to be used in a project. I expect to have ~100,000 writes (vertex + edge) per day, and far fewer reads (several times per hour). The most frequent query traces to a depth of 2 edges and I expect it to return ~10-20 result nodes.
I don't have experience with graph databases and want to work with Gremlin so that I can switch to another graph database if needed. Right now I am considering two options: Neo4j and Titan.
As far as I can see, there is a sizable community and plenty of information and tooling for Neo4j, so I'd prefer to start with it. Their capacity numbers should be enough for our needs (~34 billion nodes, ~34 billion edges), but I'm not sure what hardware requirements I will face in this case. Also, I didn't see any parallelisation options for their queries.
On the other hand, Titan is built for horizontal scalability and has integrations with massively parallel tools like Spark, so I can expect hardware requirements to scale linearly. But there is much less information/community/tooling for Titan.
I'll be glad to hear your suggestions.
Sebastian Good made a wonderful presentation comparing several graph databases to each other. You might have a look at his results here.
A quick summary of the presentation is here.
For benchmarks on each graph database with different datasets, node sizes and caches, have a look at this GitHub repository by socialsensor. Just to let you know, the results in the repo differ somewhat from the ones in the presentation.
My personal recommendation is:
If you have deep pockets, go for Neo4j. With the technical support and the easy Cypher query language, things will move pretty quickly.
If you support open source (and are patient with its development cycles), go for Titan DB with the Amazon DynamoDB backend. This will give you "infinite" scalability and good performance with both EC2 machines and DynamoDB tables. Check here for the docs and here for the code for more information.
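Since the question mentions wanting to stay portable via Gremlin, here is a minimal sketch of the 2-edge-deep query with gremlinpython; it assumes a Gremlin Server sits in front of whichever backend you choose, and the endpoint URL, the "device" label and "name" property are made up for illustration:

```python
# Sketch: a backend-agnostic 2-hop traversal through Gremlin Server.
# The endpoint URL, vertex label and property names are assumptions for illustration.
from gremlin_python.process.anonymous_traversal import traversal
from gremlin_python.driver.driver_remote_connection import DriverRemoteConnection

conn = DriverRemoteConnection("ws://localhost:8182/gremlin", "g")
g = traversal().withRemote(conn)

# Start from one vertex, walk out two edges, deduplicate, cap at 20 results.
results = (g.V().has("device", "name", "router-1")
             .out().out().dedup().limit(20)
             .valueMap(True).toList())
print(results)
conn.close()
```

The same traversal runs unchanged against Neo4j (via its Gremlin/TinkerPop adapter) or Titan, which is the main argument for writing queries in Gremlin if you expect to switch backends.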
They are both open source distributed time series databases: OpenTSDB is for metrics, while InfluxDB handles metrics and events and has no external dependencies; OpenTSDB, on the other hand, is based on HBase.
Are there any other comparisons between them?
And if I want to store and query/analyze time series metrics in real time with no deterioration or loss, which would be better?
At one of the conferences I heard about people running something like Graphite/OpenTSDB for collecting metrics centrally and InfluxDB locally on each server to collect metrics only for that server. (InfluxDB was chosen for local storage because it is easy to deploy and light on memory.)
This is not directly related to your question, but the idea appealed to me so much that I wanted to share it.
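If you want to try the local-InfluxDB half of that setup, writing a point is just an HTTP POST of line protocol. A minimal sketch against an InfluxDB 1.x node (the database name "metrics" and measurement "cpu_load" are made up):

```python
# Sketch: push one metric to a local InfluxDB 1.x instance via the line protocol.
# "metrics" (database) and "cpu_load" (measurement) are made-up names.
import requests

line = "cpu_load,host=server01 value=0.64"
resp = requests.post("http://localhost:8086/write",
                     params={"db": "metrics"},
                     data=line)
resp.raise_for_status()   # InfluxDB answers 204 No Content on success
```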
Warp 10 is another option worth considering (I'm part of the team building it), check it out at http://www.warp10.io/.
It is based on HBase but also has a standalone version which works fine for volumes in the low hundreds of billions of datapoints, so it should fit most use cases out there.
Among the strengths of Warp 10 is the WarpScript language which is built from the ground up for manipulating (Geo) Time Series.
Yet another open-source option is Blueflood: http://blueflood.io.
Disclaimer: like Paul Dix, I'm biased by the fact that I work on Blueflood.
Based on your short list of requirements, I'd say Blueflood is a good fit. Perhaps if you can specify the size of your dataset, the type of analysis you need to run or any other requirements that you think make your project unique, we could help steer you towards a more precise answer. Without knowing more about what you want to do, it's going to be hard for us to answer more meaningfully.
I have a large network of routers all interconnected in a community network. I am trying to see different ways in which I could analyse this network, gain helpful insights, and find ways it could be improved just by analyzing the graph (using Gephi). I came across a measure called "modularity", which is defined as:
a measure of the strength of division of a network into modules (also called groups, clusters or communities). Networks with high modularity have dense connections between the nodes within modules but sparse connections between nodes in different modules.
My question is: what can I learn about the network by using the modularity measure? When I use it in Gephi, for example, the network is colored by segments, but how is that helpful?
The modularity algorithm implemented in Gephi looks for nodes that are more densely connected to each other than to the rest of the network. It is well explained in the paper published on the project's website by the authors of the algorithm (Google Scholar it: Blondel, V. D., Guillaume, J.-L., Lambiotte, R., & Lefebvre, E. (2008). Fast unfolding of communities in large networks).
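For reference, the quantity that algorithm optimizes is Newman's modularity,

Q = \frac{1}{2m} \sum_{ij} \left[ A_{ij} - \frac{k_i k_j}{2m} \right] \delta(c_i, c_j)

where A_{ij} is the adjacency matrix, k_i is the degree of node i, m is the total number of edges, and \delta(c_i, c_j) equals 1 when nodes i and j fall in the same community and 0 otherwise.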
So when you apply this measure, the colors indicate the different communities detected by the algorithm; in your case, it will show which routers are more densely connected to each other than to the rest of the network.
To make this information really helpful, though, you have to juxtapose it with at least one more measure. For instance, if you apply the betweenness centrality measure (which highlights the routers that connect the most different communities, i.e. the most influential nodes in the network that serve as junctions), you'd be able to identify the most vulnerable routers in every community, which should be monitored more closely. You could also filter out a single community and identify the most connected routers within it (highest degree), which would show you which routers matter for that specific community. A rough sketch of doing this outside Gephi follows below.
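If you want to reproduce that analysis programmatically, here is a rough sketch with NetworkX; it assumes you can export the router graph as an edge list file ("routers.edgelist" is a made-up name):

```python
# Sketch: community detection, modularity and betweenness centrality with NetworkX.
# "routers.edgelist" is a placeholder for an export of your router graph.
import networkx as nx
from networkx.algorithms import community

G = nx.read_edgelist("routers.edgelist")

# Communities via greedy modularity maximisation, plus the resulting modularity score.
parts = community.greedy_modularity_communities(G)
Q = community.modularity(G, parts)
print(f"modularity = {Q:.3f}, {len(parts)} communities")

# Betweenness centrality: routers that bridge communities and act as junctions.
bc = nx.betweenness_centrality(G)
for i, part in enumerate(parts):
    hub = max(part, key=bc.get)
    print(f"community {i}: most 'bridging' router is {hub} (betweenness {bc[hub]:.3f})")
```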
All in all, the modularity measure lets you see the vulnerable spots of your network and gives you a general idea of its structure.
There is also interesting research on modularity as a measure of a network's robustness. For example, if your network's modularity is too high, it is more robust against random external attacks, but it is also susceptible to targeted attacks on the most connected hubs (high betweenness centrality nodes). On the other hand, if it is too interconnected, you could bring it down more easily with a large-scale attack on the routers (or during a blackout, for example). There is a good explanation of this in the paper (or video / slide show) on information epidemics here, and a more general explanation of metastability versus modularity here.
Hope this helps, and let me know if you have more questions, I love this subject!
So I've read this post here: and I'm a bit confused, so here's my additional question:
How does UDP handle data that is lost? It says there that UDP doesn't care whether the data arrives at the destination, or about the ordering of the data.
So, for example in an online game, how does UDP handle data loss? My understanding is that when data is lost, then, say, my character's hand can't be seen in the game? Or the Warglaive of Azzinoth which I farmed for years is lost? LOL
And if the ordering is not important, then, say, the head of my character will be messed up or the body will be dislocated, something like that?
So can anyone clarify this for me?
A better example of UDP in gaming is for position information.
In this case the position of a character is sent several times a second, and it doesn't matter if a packet is lost because another one will be sent again shortly. When you combine this with a game engine that does some level of interpolation and extrapolation, you can achieve smooth-looking motion from sporadic data.
In the case of re-ordering, if you receive a character position that is older (in time) due to packet re-ordering, you simply discard it, as you already have a newer value.
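A minimal sketch of that "discard anything older than what I already have" idea (the port number and the packet layout below are made up for illustration):

```python
# Sketch: a UDP position receiver that drops stale or duplicate updates.
# The port number and the packet layout (sequence, x, y) are made up for illustration.
import socket
import struct

PACKET = struct.Struct("!Iff")   # sequence number, x, y

sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
sock.bind(("0.0.0.0", 40000))

latest_seq = -1
while True:
    data, addr = sock.recvfrom(PACKET.size)
    seq, x, y = PACKET.unpack(data)
    if seq <= latest_seq:
        continue                 # stale or duplicate update: ignore it
    latest_seq = seq
    print(f"position update {seq} from {addr}: ({x:.1f}, {y:.1f})")
```

Lost packets simply never show up; the next update overwrites the state anyway, which is why occasional loss is tolerable for this kind of data.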
With your example of loot, it's possible this information is sent to your client via TCP, or via a reliable wrapper around UDP.
Take a look at http://code.google.com/p/lidgren-network-gen3/ for a C# UDP networking library which is designed with games in mind. It provides various unreliable and reliable channels over UDP and should give you an idea how UDP is typically used in games.
I am working on getting the performance parameters of a TCP connection, and one of these parameters is the bandwidth. I intend to use the tcp_info structure, supported in Linux from 2.6 onwards, which holds metadata about a TCP connection. The information can be retrieved with a getsockopt() call for TCP_INFO. I have spent a lot of time looking for good documentation that explains all the parameters in that structure, but couldn't find any.
I also wrote a small program to retrieve the values from tcp_info for a TCP connection, and found the measured MSS values to be zero most of the time. To make a long story short: is there a link with complete details on tcp_info, and is it reliable to use these values?
Here is a fairly comprehensive write-up of the structure and use of the Linux tcp_info by René Pfeiffer, but there are a couple of things worth noting:
The author needed to look at these data repeatedly over time because there are no aggregate stats in that structure.
The author directs you to the tcp.c source as the final authority on the meaning of any of those data.
I'm not sure what you were hoping to get from the Maximum Segment Size, but I expect you thought it meant something else.
If you are truly interested in exact measurements of bandwidth, you need to use a measurement device outside the system being tested, as even pulling these values will affect the phenomenon you are trying to observe. A passive wire sniffer is the only way to get truly accurate results. Finally, depending on your application, "bandwidth" is a really broad umbrella that flattens many measurements (e.g. latency, round-trip time, variability, jitter) into one category.
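For what it's worth, here is a rough sketch of pulling a few tcp_info fields from a connected socket in Python on Linux. The unpack layout is an assumption based on the classic struct tcp_info in <linux/tcp.h>, so double-check the offsets against your kernel's headers before trusting any value:

```python
# Sketch: read a few tcp_info fields from a connected TCP socket on Linux.
# The unpack layout mirrors the classic <linux/tcp.h> struct tcp_info and may
# need adjusting for your kernel; treat the offsets below as assumptions.
import socket
import struct

TCP_INFO = getattr(socket, "TCP_INFO", 11)   # 11 on Linux if the constant is missing

s = socket.create_connection(("example.org", 80))
buf = s.getsockopt(socket.IPPROTO_TCP, TCP_INFO, 192)

# 8 single-byte fields (state, ca_state, ...) followed by 32-bit counters.
fields = struct.unpack("8B21I", buf[:92])
snd_mss  = fields[10]   # tcpi_snd_mss
rcv_mss  = fields[11]   # tcpi_rcv_mss
rtt_us   = fields[23]   # tcpi_rtt, in microseconds
snd_cwnd = fields[26]   # tcpi_snd_cwnd, in segments

print(f"snd_mss={snd_mss} rcv_mss={rcv_mss} rtt={rtt_us}us cwnd={snd_cwnd}")
s.close()
```

As the answer above notes, the tcp.c source remains the final authority on what each of these counters actually means.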