I have a large csr_matrix in npz format. I'd like to use it as input as is, but it doesn't have an IDs field.
I added this to graph.py (but it doesn't work):
if 'IDs' in raw:
    self.set_node_ids(raw["IDs"].tolist())
else:
    # added by kwc
    self.set_node_ids(np.arange(raw["shape"][0]).tolist())
Created edg2npz.py with this:
import numpy as np
import scipy.sparse
import sys
dtype=bool
if sys.argv[2] == "int":
    dtype=int
X=[]
Y=[]
for line in sys.stdin:
    fields = line.rstrip().split()
    if len(fields) >= 2:
        x,y = fields[0:2]
        X.append(int(x))
        Y.append(int(y))
X = np.array(X, dtype=np.int32)
Y = np.array(Y, dtype=np.int32)
N = 1+max(np.max(X), np.max(Y))
V = np.ones(len(X), dtype=bool)
M = scipy.sparse.csr_matrix((V, (X, Y)), dtype=dtype, shape=(N,N))
scipy.sparse.save_npz(sys.argv[1], M)
I called it with:
python edg2npz.py demo/karate.bool.npz bool < demo/karate.edg
Unfortunately, I can't use this kind of csr_matrix...
I can write out my matrix to text and then run pecanpy on that, but my matrix is very large and it will take a long time to write it out and read it back. My matrix has N = 300M nodes and E=2B nonzero edges.
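One way to avoid the text round trip might be to rewrite the npz directly, with float weights and an explicit IDs array. This is only a sketch: the helper name `save_npz_with_ids` is hypothetical, the key names other than `IDs` are the ones `scipy.sparse.save_npz` writes, and whether pecanpy accepts integer IDs in this layout is an assumption I have not verified.

```python
import os
import tempfile

import numpy as np
import scipy.sparse


def save_npz_with_ids(matrix, path):
    """Save a sparse matrix as CSR with float weights plus an extra "IDs" array.

    Hypothetical helper: the "IDs" key is an assumption about what the
    reader expects; the other keys mirror scipy.sparse.save_npz.
    """
    # Casting copies the data array; float32 would halve the memory if precision allows.
    csr = scipy.sparse.csr_matrix(matrix, dtype=np.float64)
    np.savez(
        path,
        IDs=np.arange(csr.shape[0]),  # node IDs 0..N-1, one per row
        data=csr.data,
        indices=csr.indices,
        indptr=csr.indptr,
        shape=np.array(csr.shape),
        format="csr".encode("ascii"),
    )


# Tiny 3-node example with boolean weights, like the matrix built by edg2npz.py.
rows = np.array([0, 1, 2])
cols = np.array([1, 2, 0])
demo = scipy.sparse.csr_matrix((np.ones(3, dtype=bool), (rows, cols)), shape=(3, 3))

out_path = os.path.join(tempfile.mkdtemp(), "demo_with_ids.npz")
save_npz_with_ids(demo, out_path)

saved = np.load(out_path)
print(sorted(saved.files))
```

For N = 300M and E = 2B the float64 data array alone would take about 16 GB, so this trades disk and text-parsing time for memory.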
pecanpy --input demo/karate.bool.npz --output demo/karate.int.emb --mode SparseOTF
init pecanpy: p = 1, q = 1, workers = 1, verbose = False, extend = False, gamma = 0, random_state = None
WARNING: when p = 1 and q = 1 with unweighted graph, highly recommend using the FirstOrderUnweighted over SparseOTF. The runtime could be improved greatly with improved memory usage.
Took 00:00:00.02 to load Graph
Took 00:00:00.00 to pre-compute transition probabilities
Traceback (most recent call last):
  File "/home/k.church/venv/gft/bin/pecanpy", line 8, in <module>
    sys.exit(main())
  File "/home/k.church/venv/gft/lib/python3.8/site-packages/pecanpy/cli.py", line 333, in main
    walks = simulate_walks(args, g)
  File "/home/k.church/venv/gft/lib/python3.8/site-packages/pecanpy/wrappers.py", line 18, in wrapper
    result = func(*args, **kwargs)
  File "/home/k.church/venv/gft/lib/python3.8/site-packages/pecanpy/cli.py", line 320, in simulate_walks
    return g.simulate_walks(args.num_walks, args.walk_length)
  File "/home/k.church/venv/gft/lib/python3.8/site-packages/pecanpy/pecanpy.py", line 153, in simulate_walks
    walk_idx_mat = self._random_walks(
  File "/home/k.church/venv/gft/lib/python3.8/site-packages/numba/core/dispatcher.py", line 468, in _compile_for_args
    error_rewrite(e, 'typing')
  File "/home/k.church/venv/gft/lib/python3.8/site-packages/numba/core/dispatcher.py", line 409, in error_rewrite
    raise e.with_traceback(None)
numba.core.errors.TypingError: Failed in nopython mode pipeline (step: nopython frontend)
Failed in nopython mode pipeline (step: nopython frontend)
Failed in nopython mode pipeline (step: nopython frontend)
No implementation of function Function(<built-in function itruediv>) found for signature:
>>> itruediv(array(bool, 1d, C), Literal[int](1))
There are 6 candidate implementations:
- Of which 2 did not match due to:
Overload in function 'NumpyRulesInplaceArrayOperator.generic': File: numba/core/typing/npydecl.py: Line 244.
With argument(s): '(array(bool, 1d, C), int64)':
Rejected as the implementation raised a specific error:
AttributeError: 'NoneType' object has no attribute 'args'
raised from /home/k.church/venv/gft/lib/python3.8/site-packages/numba/core/typing/npydecl.py:255
- Of which 2 did not match due to:
Operator Overload in function 'itruediv': File: unknown: Line unknown.
With argument(s): '(array(bool, 1d, C), int64)':
No match for registered cases:
- (int64, int64) -> float64
- (int64, uint64) -> float64
- (uint64, int64) -> float64
- (uint64, uint64) -> float64
- (float32, float32) -> float32
- (float64, float64) -> float64
- (complex64, complex64) -> complex64
- (complex128, complex128) -> complex128
- Of which 2 did not match due to:
Overload of function 'itruediv': File: numba/core/typing/npdatetime.py: Line 94.
With argument(s): '(array(bool, 1d, C), int64)':
No match.
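The signature `itruediv(array(bool, 1d, C), Literal[int](1))` suggests that an in-place division is being applied to the boolean data array of my matrix. Plain NumPy rejects the same operation, as the sketch below reproduces. The helper and array names are illustrative and are not taken from pecanpy.

```python
import numpy as np


def inplace_divide_ok(values):
    """Return True if `values /= 1` succeeds on a copy of `values`."""
    arr = values.copy()
    try:
        arr /= 1  # in-place true division; the result is float64
    except TypeError:
        # NumPy refuses to cast the float64 result back into the array's dtype
        return False
    return True


bool_weights = np.ones(4, dtype=bool)
float_weights = bool_weights.astype(np.float64)

bool_ok = inplace_divide_ok(bool_weights)
float_ok = inplace_divide_ok(float_weights)
print(bool_ok, float_ok)
```

If that reading is right, storing the weights as floats rather than bool should get past this particular typing error, though I have not confirmed that against pecanpy itself.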