csr_matrices

I have a large csr_matrix in npz format. I'd like to use it as input as is, but it doesn't have an `IDs` field.

I added this to graph.py, but it doesn't work:

```python
if 'IDs' in raw:
    self.set_node_ids(raw["IDs"].tolist())
else:
    # added by kwc
    self.set_node_ids(np.arange(raw["shape"][0]).tolist())
```
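To show why the `IDs` lookup fails, here is a minimal sketch that lists the arrays `scipy.sparse.save_npz` writes for a CSR matrix. The toy 3-node graph and the temporary file name are made up for illustration; `fallback_ids` mirrors the patch above and is only a guess at what the loader wants.

```python
import os
import tempfile

import numpy as np
import scipy.sparse

# Toy 3-node graph standing in for the real matrix (illustrative only).
M = scipy.sparse.csr_matrix(
    (np.ones(3, dtype=bool), ([0, 1, 2], [1, 2, 0])), shape=(3, 3)
)

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "toy.npz")
    scipy.sparse.save_npz(path, M)
    with np.load(path) as raw:
        keys = sorted(raw.files)          # names of the arrays stored in the archive
        n_nodes = int(raw["shape"][0])    # number of rows, as used in the patch above

print(keys)
print("IDs" in keys)

# Same fallback as the patch: number the nodes 0..N-1 when no IDs are stored.
fallback_ids = np.arange(n_nodes).tolist()
```

The archive carries the CSR arrays plus `shape` and `format`, but nothing named `IDs`, so a loader that requires that key needs a fallback of this kind or an extra array in the file.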

I created edg2npz.py with this:

```python
import numpy as np
import scipy.sparse
import sys

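# Value dtype comes from the second argument: "int", otherwise bool.
dtype = bool
if len(sys.argv) > 2 and sys.argv[2] == "int":
    dtype = int

X = []
Y = []

# Read one "source target" edge per line from stdin; skip short lines.
for line in sys.stdin:
    fields = line.rstrip().split()
    if len(fields) >= 2:
        x, y = fields[0:2]
        X.append(int(x))
        Y.append(int(y))

X = np.array(X, dtype=np.int32)
Y = np.array(Y, dtype=np.int32)
N = 1 + max(np.max(X), np.max(Y))  # node IDs are assumed to be 0-based integers
V = np.ones(len(X), dtype=bool)

M = scipy.sparse.csr_matrix((V, (X, Y)), dtype=dtype, shape=(N, N))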

scipy.sparse.save_npz(sys.argv[1], M)
```

Called it with:
```bash
python edg2npz.py demo/karate.bool.npz bool < demo/karate.edg 
```

Unfortunately, pecanpy can't use this kind of csr_matrix; it fails with the traceback below.

I could write my matrix out as text and run pecanpy on that, but the matrix is very large, so writing it out and reading it back would take a long time. It has N = 300M nodes and E = 2B nonzero edges.
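For a sense of scale, here is a rough back-of-the-envelope sketch of the in-memory CSR size at these dimensions. The helper `csr_nbytes` is hypothetical, and it assumes 4-byte (int32) indices, which only holds while the edge count stays below 2^31 (about 2.147B), so 2B edges is close to that limit.

```python
def csr_nbytes(n_rows, nnz, index_itemsize=4, data_itemsize=4):
    """Approximate bytes used by the three CSR arrays (hypothetical helper)."""
    indptr = (n_rows + 1) * index_itemsize   # one row offset per node, plus one
    indices = nnz * index_itemsize           # one column index per stored edge
    data = nnz * data_itemsize               # one value per stored edge
    return indptr + indices + data


N = 300_000_000    # nodes, from the issue text
E = 2_000_000_000  # nonzero edges, from the issue text

for name, itemsize in [("bool", 1), ("float32", 4), ("float64", 8)]:
    gb = csr_nbytes(N, E, data_itemsize=itemsize) / 1e9
    print(f"{name:8s} {gb:.1f} GB")
```

Under those assumptions the binary arrays alone are on the order of 10 to 25 GB depending on the value dtype, which is why a round trip through a text edge list is unattractive.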

```txt
 pecanpy --input demo/karate.bool.npz --output demo/karate.int.emb --mode SparseOTF
init pecanpy: p = 1, q = 1, workers = 1, verbose = False, extend = False, gamma = 0, random_state = None
WARNING: when p = 1 and q = 1 with unweighted graph, highly recommend using the FirstOrderUnweighted over SparseOTF. The runtime could be improved greatly with improved  memory usage.
Took 00:00:00.02 to load Graph
Took 00:00:00.00 to pre-compute transition probabilities
Traceback (most recent call last):
  File "/home/k.church/venv/gft/bin/pecanpy", line 8, in <module>
    sys.exit(main())
  File "/home/k.church/venv/gft/lib/python3.8/site-packages/pecanpy/cli.py", line 333, in main
    walks = simulate_walks(args, g)
  File "/home/k.church/venv/gft/lib/python3.8/site-packages/pecanpy/wrappers.py", line 18, in wrapper
    result = func(*args, **kwargs)
  File "/home/k.church/venv/gft/lib/python3.8/site-packages/pecanpy/cli.py", line 320, in simulate_walks
    return g.simulate_walks(args.num_walks, args.walk_length)
  File "/home/k.church/venv/gft/lib/python3.8/site-packages/pecanpy/pecanpy.py", line 153, in simulate_walks
    walk_idx_mat = self._random_walks(
  File "/home/k.church/venv/gft/lib/python3.8/site-packages/numba/core/dispatcher.py", line 468, in _compile_for_args
    error_rewrite(e, 'typing')
  File "/home/k.church/venv/gft/lib/python3.8/site-packages/numba/core/dispatcher.py", line 409, in error_rewrite
    raise e.with_traceback(None)
numba.core.errors.TypingError: Failed in nopython mode pipeline (step: nopython frontend)
Failed in nopython mode pipeline (step: nopython frontend)
Failed in nopython mode pipeline (step: nopython frontend)
No implementation of function Function(<built-in function itruediv>) found for signature:

 >>> itruediv(array(bool, 1d, C), Literal[int](1))

There are 6 candidate implementations:
   - Of which 2 did not match due to:
   Overload in function 'NumpyRulesInplaceArrayOperator.generic': File: numba/core/typing/npydecl.py: Line 244.
     With argument(s): '(array(bool, 1d, C), int64)':
    Rejected as the implementation raised a specific error:
      AttributeError: 'NoneType' object has no attribute 'args'
  raised from /home/k.church/venv/gft/lib/python3.8/site-packages/numba/core/typing/npydecl.py:255
   - Of which 2 did not match due to:
   Operator Overload in function 'itruediv': File: unknown: Line unknown.
     With argument(s): '(array(bool, 1d, C), int64)':
    No match for registered cases:
     * (int64, int64) -> float64
     * (int64, uint64) -> float64
     * (uint64, int64) -> float64
     * (uint64, uint64) -> float64
     * (float32, float32) -> float32
     * (float64, float64) -> float64
     * (complex64, complex64) -> complex64
     * (complex128, complex128) -> complex128
   - Of which 2 did not match due to:
   Overload of function 'itruediv': File: numba/core/typing/npdatetime.py: Line 94.
     With argument(s): '(array(bool, 1d, C), int64)':
    No match.
```
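The signature in the error, `itruediv(array(bool, 1d, C), Literal[int](1))`, reads as an in-place `/=` applied to a boolean data array. Plain NumPy rejects the same operation, so one possible workaround (a sketch, not a confirmed fix) is to store the edge values as floats before saving. The toy matrix and the helper name `to_float_csr` below are invented for illustration; I have not checked whether pecanpy accepts the result.

```python
import numpy as np
import scipy.sparse


def to_float_csr(M, dtype=np.float32):
    """Return a CSR copy of M with floating-point values (hypothetical helper)."""
    return scipy.sparse.csr_matrix(M).astype(dtype)


# Toy boolean adjacency matrix standing in for the real graph.
rows = np.array([0, 1, 2])
cols = np.array([1, 2, 0])
M_bool = scipy.sparse.csr_matrix(
    (np.ones(3, dtype=bool), (rows, cols)), shape=(3, 3)
)

# In-place true division on a bool array also fails in plain NumPy,
# because the float result cannot be written back into a bool buffer.
buf = M_bool.data.copy()
try:
    buf /= 1
    inplace_div_ok = True
except TypeError:
    inplace_div_ok = False
print("in-place division on bool works:", inplace_div_ok)

# Casting keeps the sparsity pattern and only changes the stored values.
M_float = to_float_csr(M_bool)
M_float.data /= 1  # the same operation is valid on float data
print(M_float.dtype, M_float.nnz)
```

Casting touches only the `data` array, so `indptr` and `indices` stay as they are. The cost is 4 bytes per edge instead of 1 for float32.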




