python 2.7 - How to delete duplicate records from a MongoDB database


I have a MongoDB collection with more than 5 million records, and I need to delete the duplicate entries. Here is the code I tried:

    from pymongo import MongoClient

    conn = MongoClient("mongodb://127.0.0.1:27017")
    db = conn.test

    cursor = db.coll.aggregate(
        [
            {"$group": {
                "_id": {"instrument name": "$instrument name", "high": "$high",
                        "low": "$low", "v": "$v", "date": "$date",
                        "close": "$close", "open": "$open"},
                "unique_ids": {"$addToSet": "$_id"},
                "count": {"$sum": 1}}}
        ],
        {
            'allowDiskUse': 'true'
        }
    )

    response = []
    for doc in cursor:
        del doc["unique_ids"][0]
        for id in doc["unique_ids"]:
            response.append(id)

    db.coll.remove({"_id": {"$in": response}})

But when I try to execute this code, I get the following error:

    Traceback (most recent call last):
      File "delete_duplicate.py", line 12, in <module>
        'allowDiskUse': 'true'
    TypeError: aggregate() takes 2 arguments (3 given)

When I run the code on a small data set without allowDiskUse, it deletes the duplicate entries successfully. On the large data set, however, it throws an error saying that I need to use allowDiskUse, and when I add it I get the error shown above. I am using MongoDB 3.0, and ensureIndex does not work on my platform. Please help me solve this issue.

Define the pipeline as a list first:

    cursor = [{
        "$group": {
            "_id": {
                "instrument name": "$instrument name",
                "high": "$high",
                "low": "$low",
                "v": "$v",
                "date": "$date",
                "close": "$close",
                "open": "$open"
            },
            "unique_ids": {
                "$addToSet": "$_id"
            },
            "count": {
                "$sum": 1
            }
        }
    }]

Then call aggregate with allowDiskUse as a keyword argument instead of a second positional dictionary:

    result = coll.aggregate(cursor, allowDiskUse=True)
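
The deletion step from the question can be built on top of this call. What follows is only a minimal sketch, not something tested against the poster's data. The helper names ids_to_remove, chunked and remove_duplicates, and the batch size of 1000, are assumptions introduced here for illustration. It also assumes PyMongo 3.0 or later, where delete_many is available; the question's remove call could be used in the same place. Deleting in batches keeps each $in query small, since a single BSON document is limited to 16 MB.

```python
def ids_to_remove(groups):
    """Return every _id except the first one from each group document."""
    doomed = []
    for doc in groups:
        # Keep the first id in each group; the rest are duplicates.
        doomed.extend(doc["unique_ids"][1:])
    return doomed


def chunked(items, size):
    """Yield consecutive slices of `items` with at most `size` elements."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def remove_duplicates(coll, pipeline, batch_size=1000):
    """Run the pipeline on `coll` and delete the duplicates in batches.

    Hypothetical helper: `coll` is expected to be a PyMongo collection.
    """
    # allowDiskUse has to be passed as a keyword argument.
    groups = coll.aggregate(pipeline, allowDiskUse=True)
    doomed = ids_to_remove(groups)
    for batch in chunked(doomed, batch_size):
        coll.delete_many({"_id": {"$in": batch}})
    # Number of documents that were marked as duplicates.
    return len(doomed)
```

With the pipeline above, this would be called as remove_duplicates(db.coll, cursor). Which document of each group survives is arbitrary, which should not matter here because the grouped fields are identical.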
