Commit ebebb43

Revise README steps and add helpful links section (documentdb#287)
1 parent cc9abd0 commit ebebb43

1 file changed

Lines changed: 5 additions & 273 deletions

README.md

Lines changed: 5 additions & 273 deletions
@@ -142,7 +142,7 @@ singleDocumentReadResult = quickStartCollection.find_one({'name': 'John Doe'})
142142

143143
```
144144

145-
Step 8: Run aggregation pipeline query
145+
Step 8: Run aggregation pipeline operation
146146

147147
```python
148148

@@ -162,277 +162,9 @@ for eachDocument in results:
162162

163163
```
164164

165-
## To interact directly with the PostgreSQL layer
165+
### Helpful Links
166166

167-
### Pre-requisite
168-
169-
- Ensure [Docker](https://docs.docker.com/engine/install/) is installed on your system.
170-
171-
### Building DocumentDB with Docker
172-
173-
Step 1: Clone the DocumentDB repo.
174-
175-
```bash
176-
git clone https://github.com/microsoft/documentdb.git
177-
```
178-
179-
Step 2: Navigate to the cloned repo and create the Docker image.
180-
181-
```bash
182-
docker build . -f .devcontainer/Dockerfile -t documentdb
183-
```
184-
185-
Note: Validate using `docker image ls`
186-
187-
Step 3: Run the Image as a container
188-
189-
```bash
190-
docker run -v $(pwd):/home/documentdb/code -it documentdb /bin/bash
191-
192-
cd code
193-
```
194-
195-
(This mounts the local checkout into the container, so the repo does not need to be cloned again inside the image.)<br>
196-
Note: Validate that the container is running with `docker container ls`.
197-
198-
Step 4: Build & Deploy the binaries
199-
200-
```bash
201-
make
202-
```
203-
204-
Note: If the build is unsuccessful, run `git config --global --add safe.directory /home/documentdb/code` inside the container.
205-
206-
```bash
207-
sudo make install
208-
```
209-
210-
Note: To run the backend PostgreSQL tests after installing, run `make check`.
211-
212-
You are all set to work with DocumentDB.
213-
214-
### Using the Prebuilt Docker Image
215-
216-
You can use a [prebuilt docker image](https://github.com/microsoft/documentdb/pkgs/container/documentdb%2Fdocumentdb-oss/versions?filters%5Bversion_type%5D=tagged) for DocumentDB instead of building it from source. Follow these steps:
217-
218-
#### Pull the Prebuilt Image
219-
220-
Pull the prebuilt image directly from the GitHub Container Registry (`ghcr.io`):
221-
222-
```bash
223-
docker pull ghcr.io/microsoft/documentdb/documentdb-oss:PG16-amd64-0.105.0
224-
```
225-
226-
#### Running the Prebuilt Image
227-
228-
To run the prebuilt image, use one of the following commands:
229-
230-
1. Run the container:
231-
232-
```bash
233-
docker run -dt ghcr.io/microsoft/documentdb/documentdb-oss:PG16-amd64-0.105.0
234-
```
235-
236-
2. If external access is required, publish the port and pass the `-e` argument after the image name:
237-
238-
```bash
239-
docker run -p 127.0.0.1:9712:9712 -dt ghcr.io/microsoft/documentdb/documentdb-oss:PG16-amd64-0.105.0 -e
240-
```
241-
242-
This will start the container and map port `9712` from the container to the host.
243-
244-
### Connecting to the Server
245-
#### Internal Access
246-
Step 1: Run `start_oss_server.sh` to initialize the DocumentDB server and manage dependencies.
247-
248-
```bash
249-
./scripts/start_oss_server.sh
250-
```
251-
252-
Or, if you are using the prebuilt image, log in to the container:
253-
```bash
254-
docker exec -it <container-id> bash
255-
```
256-
257-
Step 2: Connect to `psql` shell
258-
259-
```bash
260-
psql -p 9712 -d postgres
261-
```
262-
263-
#### External Access
264-
Connect to `psql` shell
265-
266-
```bash
267-
psql -h localhost --port 9712 -d postgres -U documentdb
268-
```
269-
270-
## Usage directly through the PostgreSQL layer
271-
272-
Once your `DocumentDB` setup is running, you can start creating collections and indexes and running queries on them.
273-
274-
### Create a collection
275-
276-
DocumentDB provides [documentdb_api.create_collection](https://github.com/microsoft/documentdb/wiki/Functions#create_collection) function to create a new collection within a specified database, enabling you to manage and organize your BSON documents effectively.
277-
278-
```sql
279-
SELECT documentdb_api.create_collection('documentdb','patient');
280-
```
281-
282-
### Perform CRUD operations
283-
284-
#### Insert documents
285-
286-
The [documentdb_api.insert_one](https://github.com/microsoft/documentdb/wiki/Functions#insert_one) command is used to add a single document into a collection.
287-
288-
```sql
289-
select documentdb_api.insert_one('documentdb','patient', '{ "patient_id": "P001", "name": "Alice Smith", "age": 30, "phone_number": "555-0123", "registration_year": "2023","conditions": ["Diabetes", "Hypertension"]}');
290-
select documentdb_api.insert_one('documentdb','patient', '{ "patient_id": "P002", "name": "Bob Johnson", "age": 45, "phone_number": "555-0456", "registration_year": "2023", "conditions": ["Asthma"]}');
291-
select documentdb_api.insert_one('documentdb','patient', '{ "patient_id": "P003", "name": "Charlie Brown", "age": 29, "phone_number": "555-0789", "registration_year": "2024", "conditions": ["Allergy", "Anemia"]}');
292-
select documentdb_api.insert_one('documentdb','patient', '{ "patient_id": "P004", "name": "Diana Prince", "age": 40, "phone_number": "555-0987", "registration_year": "2024", "conditions": ["Migraine"]}');
293-
select documentdb_api.insert_one('documentdb','patient', '{ "patient_id": "P005", "name": "Edward Norton", "age": 55, "phone_number": "555-1111", "registration_year": "2025", "conditions": ["Hypertension", "Heart Disease"]}');
294-
```
295-
296-
#### Read document from a collection
297-
298-
The `documentdb_api.collection` function is used for retrieving the documents in a collection.
299-
300-
```sql
301-
SELECT document FROM documentdb_api.collection('documentdb','patient');
302-
```
303-
304-
Alternatively, we can apply a filter to our queries.
305-
306-
```sql
307-
SET search_path TO documentdb_api, documentdb_core;
308-
SET documentdb_core.bsonUseEJson TO true;
309-
310-
SELECT cursorPage FROM documentdb_api.find_cursor_first_page('documentdb', '{ "find" : "patient", "filter" : {"patient_id":"P005"}}');
311-
```
312-
313-
We can perform range queries as well.
314-
315-
```sql
316-
SELECT cursorPage FROM documentdb_api.find_cursor_first_page('documentdb', '{ "find" : "patient", "filter" : { "$and": [{ "age": { "$gte": 10 } },{ "age": { "$lte": 35 } }] }}');
317-
```
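
The command document passed to `find_cursor_first_page` is plain JSON embedded in a SQL string literal, so it can be generated rather than typed by hand. The following is a minimal Python sketch, assuming the default PostgreSQL setting `standard_conforming_strings = on`. `build_find_spec` and `quote_sql_literal` are hypothetical helper names, not part of DocumentDB.

```python
import json


def build_find_spec(collection, filter_doc):
    """Return the JSON command text for a find on `collection` (hypothetical helper)."""
    return json.dumps({"find": collection, "filter": filter_doc})


def quote_sql_literal(text):
    """Quote text as a SQL string literal by doubling single quotes (hypothetical helper)."""
    return "'" + text.replace("'", "''") + "'"


# Same range filter as the SQL example above: 10 <= age <= 35.
range_filter = {"$and": [{"age": {"$gte": 10}}, {"age": {"$lte": 35}}]}
spec = build_find_spec("patient", range_filter)

statement = (
    "SELECT cursorPage FROM documentdb_api.find_cursor_first_page("
    + quote_sql_literal("documentdb") + ", " + quote_sql_literal(spec) + ");"
)
print(statement)
```

When a database driver is available, passing the JSON text as a bound parameter is generally a safer choice than quoting it by hand.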
318-
319-
#### Update document in a collection
320-
321-
DocumentDB uses the [documentdb_api.update](https://github.com/microsoft/documentdb/wiki/Functions#update) function to modify existing documents within a collection.
322-
323-
The SQL command updates the `age` for patient `P004`.
324-
325-
```sql
326-
select documentdb_api.update('documentdb', '{"update":"patient", "updates":[{"q":{"patient_id":"P004"},"u":{"$set":{"age":14}}}]}');
327-
```
328-
329-
Similarly, we can update multiple documents using the `multi` property.
330-
331-
```sql
332-
SELECT documentdb_api.update('documentdb', '{"update":"patient", "updates":[{"q":{},"u":{"$set":{"age":24}},"multi":true}]}');
333-
```
334-
335-
#### Delete document from the collection
336-
337-
DocumentDB uses the [documentdb_api.delete](https://github.com/microsoft/documentdb/wiki/Functions#delete) function for precise document removal based on specified criteria.
338-
339-
The SQL command deletes the document for patient `P002`.
340-
341-
```sql
342-
SELECT documentdb_api.delete('documentdb', '{"delete": "patient", "deletes": [{"q": {"patient_id": "P002"}, "limit": 1}]}');
343-
```
344-
345-
### Collection management
346-
347-
We can review the available collections and databases by querying [documentdb_api.list_collections_cursor_first_page](https://github.com/microsoft/documentdb/wiki/Functions#list_collections_cursor_first_page).
348-
349-
```sql
350-
SELECT * FROM documentdb_api.list_collections_cursor_first_page('documentdb', '{ "listCollections": 1 }');
351-
```
352-
353-
[documentdb_api.list_indexes_cursor_first_page](https://github.com/microsoft/documentdb/wiki/Functions#list_indexes_cursor_first_page) lists the existing indexes on a collection. We can find the `collection_id` from `documentdb_api.list_collections_cursor_first_page`.
354-
355-
```sql
356-
SELECT documentdb_api.list_indexes_cursor_first_page('documentdb','{"listIndexes": "patient"}');
357-
```
358-
359-
By default, `ttl` indexes are scheduled through the `pg_cron` scheduler; the scheduled jobs can be reviewed by querying the `cron.job` table.
360-
361-
```sql
362-
select * from cron.job;
363-
```
364-
365-
### Indexing
366-
367-
#### Create an Index
368-
369-
DocumentDB uses the `documentdb_api.create_indexes_background` function, which allows background index creation without disrupting database operations.
370-
371-
The SQL command demonstrates how to create a `single field` index on `age` in the `patient` collection of the `documentdb` database.
372-
373-
```sql
374-
SELECT * FROM documentdb_api.create_indexes_background('documentdb', '{ "createIndexes": "patient", "indexes": [{ "key": {"age": 1},"name": "idx_age"}]}');
375-
```
376-
377-
The SQL command demonstrates how to create a `compound index` on the fields `registration_year` and `age` in the `patient` collection of the `documentdb` database.
378-
379-
```sql
380-
SELECT * FROM documentdb_api.create_indexes_background('documentdb', '{ "createIndexes": "patient", "indexes": [{ "key": {"registration_year": 1, "age": 1},"name": "idx_regyr_age"}]}');
381-
```
382-
383-
#### Drop an Index
384-
385-
DocumentDB uses the `documentdb_api.drop_indexes` function, which allows you to remove an existing index from a collection. The SQL command demonstrates how to drop the index named `idx_age` from the `patient` collection of the `documentdb` database.
386-
387-
```sql
388-
CALL documentdb_api.drop_indexes('documentdb', '{"dropIndexes": "patient", "index":"idx_age"}');
389-
```
390-
391-
### Perform aggregations `Group by`
392-
393-
DocumentDB provides the [documentdb_api.aggregate_cursor_first_page](https://github.com/microsoft/documentdb/wiki/Functions#aggregate_cursor_first_page) function, for performing aggregations over the document store.
394-
395-
The example counts the number of patients registered in each year.
396-
397-
```sql
398-
SELECT cursorpage FROM documentdb_api.aggregate_cursor_first_page('documentdb', '{ "aggregate": "patient", "pipeline": [ { "$group": { "_id": "$registration_year", "count_patients": { "$count": {} } } } ] , "cursor": { "batchSize": 3 } }');
399-
```
400-
401-
We can perform more complex operations; a few more usage examples follow.
402-
The example demonstrates an aggregation on patients, categorizing them into buckets defined by registration_year boundaries.
403-
404-
```sql
405-
SELECT cursorpage FROM documentdb_api.aggregate_cursor_first_page('documentdb', '{ "aggregate": "patient", "pipeline": [ { "$bucket": { "groupBy": "$registration_year", "boundaries": ["2022","2023","2024"], "default": "unknown" } } ], "cursor": { "batchSize": 3 } }');
406-
```
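
As a rough illustration of what `$bucket` does with these boundaries, the following Python sketch mimics the documented MongoDB semantics: each bucket includes its lower boundary and excludes its upper one, and values outside every bucket go to `default`. It runs over the `registration_year` values of the five patients as originally inserted, ignoring the later update and delete steps. `bucket_counts` is a hypothetical helper, not a DocumentDB API.

```python
from bisect import bisect_right
from collections import Counter


def bucket_counts(values, boundaries, default):
    """Count values per bucket; bucket i covers [boundaries[i], boundaries[i+1])."""
    counts = Counter()
    for value in values:
        # Index of the last boundary that is <= value, or -1 if there is none.
        index = bisect_right(boundaries, value) - 1
        in_range = 0 <= index < len(boundaries) - 1
        counts[boundaries[index] if in_range else default] += 1
    return dict(counts)


# registration_year of P001..P005 from the insert statements above.
registration_years = ["2023", "2023", "2024", "2024", "2025"]
result = bucket_counts(registration_years, ["2022", "2023", "2024"], "unknown")
print(result)
```

Here the two 2023 registrations fall in the `"2023"` bucket, while 2024 and 2025 land in `unknown` because `"2024"` is the exclusive upper boundary.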
407-
408-
This query performs an aggregation on the `patient` collection to group documents by `registration_year`. It collects unique patient conditions for each registration year using the `$addToSet` operator.
409-
410-
```sql
411-
SELECT cursorpage FROM documentdb_api.aggregate_cursor_first_page('documentdb', '{ "aggregate": "patient", "pipeline": [ { "$group": { "_id": "$registration_year", "conditions": { "$addToSet": { "conditions" : "$conditions" } } } } ], "cursor": { "batchSize": 3 } }');
412-
```
413-
414-
### Join data from multiple collections
415-
416-
Let's create an additional collection named `appointment` to demonstrate how a join operation can be performed.
417-
418-
```sql
419-
select documentdb_api.insert_one('documentdb','appointment', '{"appointment_id": "A001", "patient_id": "P001", "doctor_name": "Dr. Milind", "appointment_date": "2023-01-20", "reason": "Routine checkup" }');
420-
select documentdb_api.insert_one('documentdb','appointment', '{"appointment_id": "A002", "patient_id": "P001", "doctor_name": "Dr. Moore", "appointment_date": "2023-02-10", "reason": "Follow-up"}');
421-
select documentdb_api.insert_one('documentdb','appointment', '{"appointment_id": "A004", "patient_id": "P003", "doctor_name": "Dr. Smith", "appointment_date": "2024-03-12", "reason": "Allergy consultation"}');
422-
select documentdb_api.insert_one('documentdb','appointment', '{"appointment_id": "A005", "patient_id": "P004", "doctor_name": "Dr. Moore", "appointment_date": "2024-04-15", "reason": "Migraine treatment"}');
423-
select documentdb_api.insert_one('documentdb','appointment', '{"appointment_id": "A007","patient_id": "P001", "doctor_name": "Dr. Milind", "appointment_date": "2024-06-05", "reason": "Blood test"}');
424-
select documentdb_api.insert_one('documentdb','appointment', '{ "appointment_id": "A009", "patient_id": "P003", "doctor_name": "Dr. Smith","appointment_date": "2025-01-20", "reason": "Follow-up visit"}');
425-
```
426-
427-
The example presents each patient along with the doctors visited.
428-
429-
```sql
430-
SELECT cursorpage FROM documentdb_api.aggregate_cursor_first_page('documentdb', '{ "aggregate": "patient", "pipeline": [ { "$lookup": { "from": "appointment","localField": "patient_id", "foreignField": "patient_id", "as": "appointment" } },{"$unwind":"$appointment"},{"$project":{"_id":0,"name":1,"appointment.doctor_name":1,"appointment.appointment_date":1}} ], "cursor": { "batchSize": 3 } }');
431-
```
432-
433-
### Community
434-
435-
- Checkout our website at https://documentdb.io to stay up to date with the latest docs and blogs.
436-
- Please refer to our [Roadmap list](https://github.com/orgs/microsoft/projects/1407/views/1) page to contribute to the roadmap.
437-
- [FerretDB](https://github.com/FerretDB/FerretDB) integration allows using DocumentDB as backend engine.
167+
- Check out our [website](https://documentdb.io) to stay up to date with the latest on the project.
168+
- Check out our [docs](https://documentdb.io/docs) for MongoDB API compatibility, quickstarts and more.
438169
- Contributors and users can join the [DocumentDB Discord channel](https://discord.gg/vH7bYu524D) for quick collaboration.
170+
- Check out [FerretDB](https://github.com/FerretDB/FerretDB) and their integration of DocumentDB as a backend engine.
