Hands on: Gephi, NetworkX, iGraph, Cytoscape Roberta Sinatra with the contribution of Pierre Deville, Marc Santolini and Michael Szell Northeastern University, October 1st 2014 About the problem 4. of the assignment (due on Monday): “The red and blue communities are said to be interactive if a typical red node is just two steps away from some blue node and vice versa. Evaluate the fraction of purple nodes required for the communities to become interactive.” Hint for the solution: Assume that a typical red node has k nearest neighbors. Quantify how many of the k neighbors are purple. Quantify how many blue neighbors a purple node has. Which is the condition for a typical red node of degree k to have at least a blue node as second neighbor? Young discipline, young tools Many tools out there, “The Tool” does not exist Choice of software, programming language, library, module, ..., depends on the your application, field, taste, problem See the following slides as a guidelines. If you know already well another programming language, tool, library, feel free to use it Gephi Tutorial inspired by the official Gephi tutorial on https://gephi.github.io Victor Hugo and “Les Misérables” Les Misérables Do not open the file in the window at the start of Gephi! Open the file (do one of the following): download the attachment from the email sent earlier today: lesmiserables.zip go to the course website: http://barabasilab.neu.edu/courses/phys5116/content/ lesmiserables.zip How the file looks like: nodes File: lesmiserables_csv/node_list.csv How the file looks like: nodes File: lesmiserables_csv/edge_list.csv How the GML file looks like: nodes File: lesmiserables.gml How the GML file looks like : edges File: lesmiserables.gml Importing the file in Gephi (.csv) Importing the file in Gephi (.csv) 1 2 Importing the file in Gephi (.csv) 1 2 Importing the file in Gephi (.csv) Importing the file in Gephi (.csv) Alternative: Importing the GML file in Gephi Alternative: Importing the GML file in Gephi Random node position Navigate with the mouse wheel Edge thickness Let’s make it pretty: Layout Force Atlas Force Atlas Fruchterman Reingold Ranking Nodes (color) Ranking Nodes (color) Ranking Nodes (color) Ranking Nodes (color) Ranking Nodes (color) Ranking Nodes (color) Ranking Nodes (color) Ranking Nodes (color) 2 1 Metrics Metrics Metrics Metrics Computes metrics related to shortest paths Ranking Nodes (size) Ranking Nodes (size) Ranking Nodes (size) Adjust Layout Show and adjust labels Show labels Proportional to node size Change fontsize Show and adjust labels Change Font Change color Show and adjust labels Change Font Change color Show and adjust labels Display other info Show and adjust labels Display other info Community Detection 1 2 3 Community Detection Right click to randomize colors Click to change the color Community Detection Filters We can hide nodes or edges Filters Export the visualization Export the visualization 1 2 Export the visualization Play with other settings Export the visualization Vectorial graphics Save the file Data Let’s remove the filter first Data Data To plot degree distributions, centralities, etc. in a better way Data Gephi Many plugins to make it better! About this network About this network Visualizing and measuring is only the start: always interpret your results! NetworkX: a Python module Inspired by the tutorial of Salvatore Scellato for the course “Social and Technological Network Analysis”, University of Cambridge (2011) Why python? NetworkX: online resources Getting started Import NetworkX >>> import networkx as nx There are different Graph classes for undirected and directed networks. Let’s create a basic Graph class >>> g = nx.Graph() # empty graph The graph g can be grown in several ways. NetworkX includes many graph generator functions and facilities to read and write graphs in many formats Getting started Import NetworkX >>> import networkx as nx There are different Graph classes for undirected and directed networks. Let’s create a basic Graph class >>> g = nx.Graph() # empty graph The graph g can be grown in several ways. NetworkX includes many graph generator functions and facilities to read and write graphs in many formats Getting started: add nodes # One node at a time >>> g.add_node(1) # method of nx.Graph # A list of nodes >>> g.add_nodes_from([2 ,3]) # A container of nodes >>> h = nx.path_graph(10) >>> g.add_nodes_from(h) # g now contains the nodes of h # In contrast, you can remove any node of the graph >>> g.remove_node(2) Getting started: add edges # Single edge >>> g.add_edge(1,2) >>> e=(2,3) >>> g.add_edge(*e) # unpack edge tuple # List of edges >>> g.add_edges_from([(1 ,2) ,(1 ,3)]) # Container of edges >>> g.add_edges_from(h.edges()) # In contrast, you can remove any edge of the graph >>> g.remove_edge(1,2) Getting started: access nodes and edges >>> g.add_edges_from([(1 ,2) ,(1 ,3)]) >>> g.add_node(‘a’) >>> >>> >>> [1, g.number_of_nodes() # also g.order() 4 g.number_of_edges() # also g.size() 2 g.nodes() 2, 3, ‘a’] >>> g.edges() [(1, 2), (1, 3)] >>> g.neighbors(1) [2, 3] >>> g.degree(1) 2 Getting started: node and edge iterators Many applications require iteration over nodes or over edges: simple and easy in NetworkX >>> g.add_edge(1,2) >>> for node in g.nodes(): print node, g.degree(node) 1, 1 2, 1 >>> g.add_edge(1,3,weight=2.5) >>> g.add_edge(1,2,weight=1.5) >>> for n1,n2,attr in g.edges(data=True): # unpacking print n1,n2,attr[‘weight’] 1, 2, 1.5 Getting started: graph operators Classic graph operations subgraph(G, nbunch) - induce subgraph of G on nodes in nbunch union(G1,G2) - graph union disjoint_union(G1,G2) - graph union assuming all nodes are different cartesian_product(G1,G2) - return Cartesian product graph compose(G1,G2) - combine graphs identifying nodes common to both complement(G) - graph complement create_empty_copy(G) - return an empty copy of the same graph class convert_to_undirected(G) - return an undirected representation of G convert_to_directed(G) - return a directed representation of G Getting started: graph generators # small famous graphs >>> petersen=nx.petersen_graph() >>> tutte=nx.tutte_graph() >>> maze=nx.sedgewick_maze_graph() >>> tet=nx.tetrahedral_graph() # classic graphs >>> K_5=nx.complete_graph(5) >>> K_3_5=nx.complete_bipartite_graph(3,5) >>> barbell=nx.barbell_graph(10,10) >>> lollipop=nx.lollipop_graph(10,20) # random graphs >>> er=nx.erdos_renyi_graph(100,0.15) >>> ws=nx.watts_strogatz_graph(30,3,0.1) >>> ba=nx.barabasi_albert_graph(100,5) >>> red=nx.random_lobster(100,0.9,0.9) Getting started: graph I/O NetworkX is able to read/write graphs from/to files using common graph formats: edge lists adjacency lists GML GEXF Python pickle GraphML Pajek LEDA YAML Getting started: read and write edge lists Read and write edge lists g = nx.read_edgelist(path,comments='#',create_using=No ne,delimiter='',nodetype=None,edgetype=None,encodi ng='utf-8') nx.write_edgelist(g,path,comments='#', delimiter=' ',encoding='utf-8') Formats • Node pairs with no data: 1 2 • Python dictionary as data: 1 2 {'weight':7, 'color':'green'} • Arbitrary data: 1 2 7 green Getting started: simple analysis lesmis= nx.read_weighted_edgelist('lesmiserables_edgelist.txt') lesmis_und = nx.Graph(lesmis) N,K = lesmis_und.order(), lesmis_und.size() avg_deg = float(2*K)/N >>> print "Nodes: ", N Nodes: 77 >>> print "Edges: ", K Edges: 254 >>> print "Average degree: ", avg_deg Average degree: 6.5974025974 Getting started: simple analysis >>> lesmis_GML=nx.read_gml('lesmiserables.gml') >>> lesmis_GML_und=nx.Graph(lesmis_GML) N_gml,K_gml = lesmis_GML_und.order(), lesmis_GML_und.size() avg_deg_gml = float(2*K_gml)/N_gml >>> print "Nodes: ", N_gml Nodes: 77 >>> print "Edges: ", K_gml Edges: 254 >>> print "Average degree: ", avg_deg_gml Average degree: 6.5974025974 Getting started: simple analysis >>> degrees = lesmis_und.degree() >>> degrees {u'30': 2, u'54': 4, u'42': 3, u'43': 3,... >>> degrees.values() [2, 4, 3, 3, 9, 11, 13, 12, 13, 12, 10, 1, # Clustering coefficient of all nodes (in a dictionary) >>> clust_coefficients = nx.clustering(lesmis_und) # Average clustering coefficient >>> ccs = nx.clustering(lesmis_und) >>> avg_clust = sum(ccs.values()) / len(ccs) >>> avg_clust 0.5731367499320134 Getting started: simple analysis # Betweenness centrality >>> bet_cen = nx.betweenness_centrality(lesmis_und) # Closeness centrality >>> clo_cen = nx.closeness_centrality(lesmis_und) # Eigenvector centrality >>> eig_cen = nx.eigenvector_centrality(lesmis_und) Getting started: simple analysis # simple network drawing >>> nx.draw(lesmis_und) Getting started: simple analysis >>> import numpy as np >>> import matplotlib.pyplot as plt >>> kmin=min(degrees) >>> kmax=max(degrees) >>> logBins = np.logspace(np.log10(kmin), >>> np.log10(kmax),num=20) >>> logBinDensity, binedges = np.histogram(degrees, bins=logBins, density=True) >>> logBins = np.delete(logBins, -1) >>> >>> >>> >>> fig = plt.figure() ax = fig.add_subplot(111) ax.set_xscale('log') ax.set_yscale('log') >>> plt.plot(logBins,logBinDensity,'x',color='blue') >>> plt.show() Getting started: simple analysis Getting started: simple analysis >>> plt.xlabel('k', fontsize=20) >>> plt.ylabel('P(k)', fontsize=30) Remember: compare your network always with a null model! >>> nx.connected_double_edge_swap(lesmis_und, 100) 64 Remember: compare your network always with a null model! Preserves the degree sequence, but all the rest might have changed (clustering coefficient, centralities, shortest path, etc.) iGraph Advantages Available for python, R, Ruby, C/C++ Faster than NetworkX in most cases (core written in C) Great to draw graphs with specific layouts Disadvantages Still some bugs (not for the simple routines) Less friendly than NetworkX Resources Gephi: https://gephi.github.io/ Python and much more: https://www.enthought.com/products/canopy/ Networkx: https://gephi.github.io/ iGraph: http://igraph.org/ Matplotlib: http://matplotlib.org Numpy: http://www.numpy.org/ One of your best friends: stackoverflow And remember: practice, practice, practice!
© Copyright 2024 ExpyDoc