python - Efficiently compute column-wise sums of a sparse array where every non-zero element is 1


I have a bunch of data in SciPy compressed sparse row (CSR) format. Of course, the majority of elements are zero, and I further know that all non-zero elements have a value of 1. I want to compute sums over different subsets of rows of the matrix. At the moment I am doing the following:

import numpy as np
import scipy as sp
import scipy.sparse

# create data with sparsely distributed ones
data = np.random.choice((0, 1), size=(1000, 2000), p=(0.95, 0.05))
data = sp.sparse.csr_matrix(data, dtype='int8')

# generate column-wise sums over random subsets of rows
nrand = 1000
for k in range(nrand):
    inds = np.random.choice(data.shape[0], size=100, replace=False)

    # 60% of the time is spent here
    extracted_rows = data[inds]

    # 20% of the time is spent here
    row_sum = extracted_rows.sum(axis=0)

The last few lines there are the bottleneck in a larger computational pipeline. As annotated in the code, 60% of the time is spent slicing the data with the random indices, and 20% is spent computing the actual sum.

It seems to me that I should be able to use my knowledge about the data in the array (i.e., any non-zero value in the sparse matrix is 1; no other values are present) to compute these sums more efficiently. Unfortunately, I cannot figure out how. Dealing with data.indices, perhaps? I have tried other sparsity structures (e.g. a CSC matrix), as well as converting to a dense array first, but these approaches were all slower than the CSR matrix approach.
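One way to act on data.indices directly (a sketch of the idea hinted at above, not code from the post): because every stored value is 1, the column-wise sum over a subset of rows is just a count of the column indices stored for those rows, which np.bincount produces in one call.

```python
import numpy as np
import scipy.sparse

# Same setup as in the question.
data = np.random.choice((0, 1), size=(1000, 2000), p=(0.95, 0.05))
data = scipy.sparse.csr_matrix(data, dtype='int8')
inds = np.random.choice(data.shape[0], size=100, replace=False)

# Row i stores its column indices in
# data.indices[data.indptr[i]:data.indptr[i + 1]]; since every stored
# value is 1, counting those indices gives the column-wise sum.
cols = np.concatenate([data.indices[data.indptr[i]:data.indptr[i + 1]]
                       for i in inds])
row_sum = np.bincount(cols, minlength=data.shape[1])

# Agrees with the slice-and-sum approach from the question.
reference = np.asarray(data[inds].sum(axis=0)).ravel()
```

Whether the Python-level loop over inds beats the built-in slicing depends on the subset size, so this is worth timing on the real data.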

It is well known that indexing of sparse matrices is relatively slow, and there have been questions about getting around that by accessing the data attributes directly.

But first some timings. Using the data and inds that you show, I get:

In [23]: datad = data.A   # the conversion itself takes ~3.76 ms per loop

In [24]: timeit row_sumd = datad[inds].sum(axis=0)
1000 loops, best of 3: 529 µs per loop

In [25]: timeit row_sum = data[inds].sum(axis=0)
1000 loops, best of 3: 890 µs per loop

In [26]: timeit d = datad[inds]
10000 loops, best of 3: 55.9 µs per loop

In [27]: timeit d = data[inds]
1000 loops, best of 3: 617 µs per loop

The sparse version is slower than the dense one, but not by a lot. The sparse indexing is much slower, but its sum is faster.

The sparse sum is done with a matrix product:

# from sparse.spmatrix.sum
    ....
    return np.asmatrix(np.ones((1, m), dtype=res_dtype)) * self

That suggests a faster way: turn inds into an appropriate array of 1s and multiply.

In [49]: %%timeit
   ....: b = np.zeros((1, data.shape[0]), 'int8')
   ....: b[:, inds] = 1
   ....: rowmul = b * data
   ....:
1000 loops, best of 3: 587 µs per loop

That makes the sparse operation about as fast as the equivalent dense one. (But converting it to dense first is slower.)

==================

The last time test was missing the np.asmatrix that is present in the sparse sum. But the times are similar, and the results are the same:

In [232]: timeit b=np.zeros((1,data.shape[0]),'int8'); b[:,inds]=1; x1=np.asmatrix(b)*data
1000 loops, best of 3: 661 µs per loop

In [233]: timeit b=np.zeros((1,data.shape[0]),'int8'); b[:,inds]=1; x2=b*data
1000 loops, best of 3: 605 µs per loop

One produces a matrix, the other an array. But both are doing a matrix product, the 2nd dimension of b against the 1st of data. Even though b is an array, the task is actually delegated to data and its matrix product - in a not very transparent way.

In [234]: x1
Out[234]: matrix([[9, 9, 5, ..., 9, 5, 3]], dtype=int8)

In [235]: x2
Out[235]: array([[9, 9, 5, ..., 9, 5, 3]], dtype=int8)

b*data.A is element-wise multiplication and raises an error; np.dot(b, data.A) works but is slower.
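To illustrate the distinction (a minimal snippet with the same shapes as the post; the arrays here are stand-ins for b and data.A):

```python
import numpy as np

b = np.zeros((1, 1000), dtype='int8')
m = np.zeros((1000, 2000), dtype='int8')   # stands in for data.A

# `*` between two ndarrays is element-wise; shapes (1, 1000) and
# (1000, 2000) cannot broadcast, so this raises a ValueError.
try:
    b * m
    raised = False
except ValueError:
    raised = True

# np.dot is the true matrix product: (1, 1000) x (1000, 2000) -> (1, 2000)
out = np.dot(b, m)
```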

Newer numpy/python has the matmul operator. I see the same time pattern:

In [280]: timeit b@dataA           # dense product
100 loops, best of 3: 2.64 ms per loop

In [281]: timeit b@data.A          # slower due to the `.A` conversion
100 loops, best of 3: 6.44 ms per loop

In [282]: timeit b@data            # sparse product
1000 loops, best of 3: 571 µs per loop

np.dot may delegate the action to sparse as well, though you have to be careful with it. I hung my machine with np.dot(csr_matrix(b), data.A).
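Pulling the indicator-vector trick together into a small helper (my own sketch; the name subset_colsum is made up, and it assumes a SciPy recent enough that `@` between an ndarray and a sparse matrix dispatches to the sparse product):

```python
import numpy as np
import scipy.sparse

def subset_colsum(data, inds):
    """Column-wise sums over rows `inds` of a sparse matrix of ones."""
    b = np.zeros((1, data.shape[0]), dtype=np.int64)
    b[:, inds] = 1
    # ndarray @ csr_matrix dispatches to the fast sparse product;
    # flatten because the result may come back as a 2-D matrix/array.
    return np.asarray(b @ data).ravel()

# Quick check against the plain slice-and-sum approach.
data = np.random.choice((0, 1), size=(1000, 2000), p=(0.95, 0.05))
data = scipy.sparse.csr_matrix(data, dtype='int8')
inds = np.random.choice(data.shape[0], size=100, replace=False)

fast = subset_colsum(data, inds)
slow = np.asarray(data[inds].sum(axis=0)).ravel()
```

Using int64 for the indicator row avoids the int8 overflow that could bite once a column has more than 127 ones in the selected subset.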

