Data

Free Open Data from Related-Work.net

On this website we offer the data used on related-work.net for download. The available data sets are described in the sections below.

License

We offer our own work under an Open Data Commons Attribution License (ODC-By), which means you are free to use, share, and adapt this data as long as you attribute the source by linking to this website. Our data sources (Arxiv.org/Meta harvesting, Arxiv.org/Bulk download) impose further restrictions, including that you link back to arxiv.org for downloads.

Disclaimer

The format of the data presented below is likely to change as we revise our data structures in the future. All available data has been created using sources from arxiv.org. We do not guarantee the validity or completeness of the data.

Arxiv citation graph

We downloaded the source files of all arxiv articles published until 2012-09-31, extracted the references, and matched them against the metadata using these Python scripts. The result is a 2.0 GB *.txt file with more than 16 million lines representing the citation graph. Each line has the following format:
{source-id}|{reference string as found in tex sources}|{target-id if found}

Examples

1008.4729|M. Johnson, K. Zumbrun, and P. Noble, Nonlinear stability of viscous roll waves, preprint (2010).|1002.0788
1002.2065|K. Binder. J. Non-crystalline Solids , 307:1--8, 2002.|
astro-ph/0006446|D. Boyanovsky and H. J. de Vega, Phys. Rev. D61 , 105014 (2000).|
0711.3015|Coldea, R., Tennant, D. A. Tylczynski, Z. Extended scattering continua [...]. Phys. Rev. B / 68 , 134424 (2003).|cond-mat/0307025
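
To read this format programmatically, a minimal Python sketch could look like the following. It is not one of the scripts mentioned above; the names Citation and parse_citation_line are made up for illustration. It assumes that the reference string may itself contain a "|" character, so it takes the first field from the left and the last field from the right.

```python
from typing import NamedTuple, Optional


class Citation(NamedTuple):
    source_id: str            # arxiv id of the citing paper
    reference: str            # raw reference string from the tex sources
    target_id: Optional[str]  # arxiv id of the cited paper, None if unmatched


def parse_citation_line(line: str) -> Citation:
    """Split one line of the citation graph file into its three fields."""
    line = line.rstrip("\n")
    # The reference text is free-form and might contain "|" itself (assumption),
    # so split once from the left and once from the right.
    source_id, rest = line.split("|", 1)
    reference, target_id = rest.rsplit("|", 1)
    # An empty third field means no target was found.
    return Citation(source_id, reference.strip(), target_id or None)


examples = [
    "1008.4729|M. Johnson, K. Zumbrun, and P. Noble, Nonlinear stability "
    "of viscous roll waves, preprint (2010).|1002.0788",
    "1002.2065|K. Binder. J. Non-crystalline Solids , 307:1--8, 2002.|",
]
parsed = [parse_citation_line(line) for line in examples]
for citation in parsed:
    print(citation.source_id, "->", citation.target_id)
```

For the full file, the same function can be applied line by line while iterating over the open file, which avoids loading all 2.0 GB into memory.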

Downloads

Arxiv metadata

The arxiv offers the metadata for all articles for download through an Open Archives Initiative (OAI) interface. We downloaded all available data until 2012-09-31 and stored it in JSON format using these scripts.

Example

[  "0704.0204",
   {"publisher": [], 
    "description": ["  We present a theory of transport through interacting [.....] 
                       A $\\pi$-transition of the supercurrent can\nbe driven by 
                       tuning gate or bias voltages.\n", 
                    "Comment: 11 pages, 4 figures"],
    "language": [], 
    "rights": [], 
    "format": [], 
    "contributor": [], 
    "source": [], 
    "creator": ["Pala, Marco G.", "Governale, Michele", "K\u00f6nig, J\u00fcrgen"], 
    "relation": [], 
    "coverage": [], 
    "date": ["2007-04-02", "2007-08-29"], 
    "title": ["Non-Equilibrium Josephson and Andreev Current through Interacting\n  Quantum Dots"], 
    "identifier": ["http://arxiv.org/abs/0704.0204", "New J. Phys. 9 (2007) 278", "doi:10.1088/1367-2630/9/8/278"], 
    "type": ["text"], 
    "subject": ["Condensed Matter - Superconductivity", "Condensed Matter - Mesoscale and Nanoscale Physics"]}
]
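
As a rough illustration of how such a record can be used, the following Python sketch parses a shortened copy of the example above and pulls out a few fields. The function name summarize and the keys of the returned dictionary are made up for this example, and how the records are laid out in the downloaded file is not covered here.

```python
import json

# Shortened copy of the example record; a raw string keeps the JSON escapes intact.
raw = r'''
["0704.0204",
 {"creator": ["Pala, Marco G.", "Governale, Michele", "K\u00f6nig, J\u00fcrgen"],
  "date": ["2007-04-02", "2007-08-29"],
  "title": ["Non-Equilibrium Josephson and Andreev Current through Interacting\n  Quantum Dots"],
  "identifier": ["http://arxiv.org/abs/0704.0204", "New J. Phys. 9 (2007) 278",
                 "doi:10.1088/1367-2630/9/8/278"]}]
'''


def summarize(record):
    """Reduce one [id, metadata] pair to a flat dictionary."""
    arxiv_id, meta = record
    # The title in the example contains a line break and extra spaces; collapse them.
    title = " ".join(meta["title"][0].split())
    # The "identifier" list mixes a URL, a journal reference, and a DOI.
    dois = [s[len("doi:"):] for s in meta.get("identifier", []) if s.startswith("doi:")]
    dates = meta.get("date", [])
    return {
        "id": arxiv_id,
        "title": title,
        "authors": meta.get("creator", []),
        "earliest_date": min(dates) if dates else None,  # ISO dates sort as strings
        "doi": dois[0] if dois else None,
    }


summary = summarize(json.loads(raw))
print(summary)
```

Every metadata field is a list, even when it holds a single value, so the sketch indexes into the lists or falls back to an empty default.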

Downloads

Arxiv citation and author graph as Neo4J database

The above information is loaded into a neo4j (v 1.7.2) graph database using this Python script. The basic structure of the graph db is as follows:
  • We have nodes for every paper storing basic metadata.
  • We have reference relations between papers.
  • We have nodes for every mentioned author name and an author relation for each of their papers.
Here is my attempt at visualizing the structure.
    Paper1               Paper2
      |                    |
      |[author]            |[author]
      v                    v
    Author1              Author2
For more information, please refer to the documentation.
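
To make the structure concrete without a running database, here is a small in-memory Python sketch of the same model. It does not use neo4j or its API; the class CitationGraph and its methods are hypothetical and only mirror the three kinds of elements listed above (paper nodes, reference relations, author nodes with author relations).

```python
from collections import defaultdict


class CitationGraph:
    """Toy in-memory model: paper nodes, author nodes, two relation types."""

    def __init__(self):
        self.papers = {}                    # paper id -> basic metadata
        self.references = defaultdict(set)  # citing paper id -> cited paper ids
        self.authors = defaultdict(set)     # author name -> ids of their papers

    def add_paper(self, paper_id, title, author_names):
        self.papers[paper_id] = {"title": title}
        for name in author_names:
            # One author relation per (paper, author name) pair.
            self.authors[name].add(paper_id)

    def add_reference(self, source_id, target_id):
        self.references[source_id].add(target_id)

    def cited_authors(self, author_name):
        """Authors whose papers are referenced by papers of author_name."""
        cited_papers = set()
        for paper_id in self.authors.get(author_name, set()):
            cited_papers |= self.references.get(paper_id, set())
        return {
            name
            for name, paper_ids in self.authors.items()
            if name != author_name and paper_ids & cited_papers
        }


graph = CitationGraph()
graph.add_paper("Paper1", "First paper", ["Author1"])
graph.add_paper("Paper2", "Second paper", ["Author2"])
graph.add_reference("Paper1", "Paper2")
print(graph.cited_authors("Author1"))
```

The query walks author relation, then reference relation, then author relation again, which is the kind of traversal the graph layout is meant to support.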

Downloads
