ServerSelectionTimeoutError when inserting documents into DocumentDB
I have an S3 bucket that contains either
- a JSON file with the perfumes of a single brand, or
- a folder per brand, holding that brand's perfumes as JSON files.
I know how to list their keys, but I would like to insert these objects into my DocumentDB database, in collections that correspond to their brand.
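To make the intended mapping concrete, here is a minimal sketch of how a key could be turned into a collection name. The helper name `collection_for_key` and the brand names in the examples are made up for illustration, and it assumes that a top-level JSON file is named after its brand.

```python
from pathlib import PurePosixPath
from typing import Optional


def collection_for_key(key: str) -> Optional[str]:
    """Map an S3 key to a brand collection name (hypothetical helper).

    Assumed layout, based on the description above:
      'dior/sauvage.json' -> 'dior'    (file inside a brand folder)
      'chanel.json'       -> 'chanel'  (single JSON file for a brand)
      'dior/'             -> None      (folder placeholder, nothing to insert)
    """
    # Empty keys and folder placeholders carry no document.
    if not key or key.endswith('/'):
        return None
    path = PurePosixPath(key)
    # A key with a folder prefix: the first segment is taken as the brand.
    if len(path.parts) > 1:
        return path.parts[0]
    # A top-level file: the file name without its extension is taken as the brand.
    return path.stem
```

The script I am actually running follows.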
import json
import sys
import boto3
import pymongo
def iterate_bucket_items(bucket):
    """
    Generator that iterates over all objects in a given s3 bucket
    See http://boto3.readthedocs.io/en/latest/reference/services/s3.html#S3.Client.list_objects_v2
    for return data format
    :param bucket: name of s3 bucket
    :return: dict of metadata for an object
    """
    client = boto3.client('s3')
    paginator = client.get_paginator('list_objects_v2')
    page_iterator = paginator.paginate(Bucket=bucket)
    for page in page_iterator:
        if page['KeyCount'] > 0:
            for item in page['Contents']:
                yield item
##Create a MongoDB client, open a connection to Amazon DocumentDB as a replica set and specify the read preference as secondary preferred
##(<password> is a placeholder; the host is the one shown in the error output)
client = pymongo.MongoClient('mongodb://user:<password>@datahub.cluster-1.eu-west-3.docdb.amazonaws.com:27017/?ssl=true&ssl_ca_certs=rds-combined-ca-bundle.pem&replicaSet=rs0&readPreference=secondaryPreferred&retryWrites=false')
##Specify the database to be used
db = client.perfumes
##Resource handle used below to download each object's body
s3 = boto3.resource('s3')
c = 0
for i in iterate_bucket_items(bucket='datahubpredicity'):
    keyName = i['Key']
    print(keyName)
    ##Only handle objects inside a brand folder; skip the folder placeholders themselves
    if '/' in keyName and not keyName.endswith('/'):
        print("keyName: ", keyName)
        ##The first path segment is the brand folder
        folder, file = keyName.split('/', 1)
        ##Specify the collection to be used
        col = db[folder]
        content_object = s3.Object('datahubpredicity', keyName)
        file_content = content_object.get()['Body'].read().decode('utf-8')
        json_content = json.loads(file_content)
        print(json_content)
        ##Insert a single document
        col.insert_one(json_content)
        c += 1
        ##Stop after six inserts
        if c >= 6:
            break
# ##Print the result to the screen
# print(x)
##Close the connection
client.close()
But it returns:
pymongo.errors.ServerSelectionTimeoutError:
datahub.cluster-1.eu-west-3.docdb.amazonaws.com:27017:
timed out, Timeout: 30s,
Topology Description: <TopologyDescription id: 6254472217824b192df5665d,
topology_type: ReplicaSetNoPrimary,
servers: [<ServerDescription ('datahub.cluster-1.eu-west-3.docdb.amazonaws.com', 27017) server_type: Unknown,
rtt: None, error=NetworkTimeout('datahub.cluster-1.eu-west-3.docdb.amazonaws.com:27017: timed out')>]>
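Since the topology description reports a NetworkTimeout rather than an authentication or write error, one thing that can be checked separately is whether the endpoint is reachable at all from the machine running the script. Below is a minimal sketch using only the standard library; the helper name `can_reach` is made up for illustration, and it only tests a plain TCP connection, not TLS or credentials.

```python
import socket


def can_reach(host: str, port: int, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within `timeout` seconds."""
    try:
        # create_connection resolves the host name and attempts a TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers timeouts, refused connections and DNS resolution failures.
        return False
```

If `can_reach('datahub.cluster-1.eu-west-3.docdb.amazonaws.com', 27017)` returns False, the problem would most likely be at the network level (for example, the script running outside the cluster's VPC, or a security group that does not allow port 27017) rather than in the insert logic.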