[Bitcoin-development] Building a node crawler to map network

Rick Wesson rick at support-intelligence.com
Tue Sep 6 14:36:45 UTC 2011


I've got minna patches for nio based on bitcoinj. I've enumerated the
network a few times and am working on a DNS seed service as well as some
weather reports.

Happy to start a branch when the committers are ready.

-rick


On Tue, Sep 6, 2011 at 12:42 AM, Steve <shadders.del at gmail.com> wrote:

> Hi All,
>
> I started messing around today with building a node crawler to try and
> map out the bitcoin network and hopefully provide some useful
> statistics.  It's very basic so far using a mutilated bitcoinj to
> connect (due me being java developer and not having a clue with c/c++).
>  If it's worthwhile I'll hack bitcoinj some more to run on top Netty to
> take advantage of it's NIO architecture (netty's been shown to handle
> 1/2 million concurrent connections so would be ideal for the purpose).
>
> Hoping to a get a bit of input into what would be useful as well as
> strategy for getting max possible connections without distorted data.  I
> seem to recall Gavin talking about the need for some kind of network
> health monitoring so I assume there's a need for something like this...
>
> Firstly at the moment basically I'm just storing version message and the
> results of getaddr for each node that I can connect to.  Is there any
> other useful info that can be extracted from a node that's worth
> collecting?
>
> Second and main issue is how to connect.  From my first very basic
> probing it seems the very vast majority of nodes don't accept incoming
> connections no doubt due to lack of upnp.  So it seems the active crawl
> approach is not really ideal for the purpose.  Even if it was used the
> resultant data would be hopelessly distorted.
>
> A honeypot approach would probably be better if there was some way to
> make a node 'attractive' to other nodes to connect to.  That way it
> could capture non-listening nodes as well.  If there is some way to
> influence other nodes to connect to the crawler node that solves the
> problem.  If there isn't which I suspect is the case then perhaps
> another approach is to build an easy to deploy crawler node that many
> volunteers could run and that could then upload collected data to a
> central repository.
>
> While I'm asking questions I'll add one more regarding the getaddr
> message.  It seems most nodes return about 1000 addresses in response to
> this message.  Obviously most of these nodes haven't actually talked to
> all 1000 on the list so where does this list come from?  Is it mixture
> of addresses obtained from other nodes somehow sorted by timestamp?
> Does it include some nodes discovered by IRC/DNS? Or are those only used
> to find the first nodes to connect to?
>
> Thanks for any input... Hopefully I can build something that's useful
> for the network...
>
>
> ------------------------------------------------------------------------------
> Special Offer -- Download ArcSight Logger for FREE!
> Finally, a world-class log management solution at an even better
> price-free! And you'll get a free "Love Thy Logs" t-shirt when you
> download Logger. Secure your free ArcSight Logger TODAY!
> http://p.sf.net/sfu/arcsisghtdev2dev
> _______________________________________________
> Bitcoin-development mailing list
> Bitcoin-development at lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/bitcoin-development
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.linuxfoundation.org/pipermail/bitcoin-dev/attachments/20110906/8c63140e/attachment.html>


More information about the bitcoin-dev mailing list