NoSQL Integration with Django: MongoDB
NoSQL databases have gained popularity in recent years due to their ability to handle large and unstructured data efficiently. MongoDB is one such NoSQL database that is widely used in the industry. In this section, we will explore how to integrate MongoDB with Django and leverage its capabilities for big data management.
To integrate MongoDB with Django, we need to install the `djongo` package, which provides a seamless interface between Django and MongoDB. Here's how you can install it:

```bash
pip install djongo
```
Once installed, you need to configure your Django settings to use MongoDB as the database backend. Update the `DATABASES` section in your `settings.py` file as follows:

```python
DATABASES = {
    'default': {
        'ENGINE': 'djongo',
        'NAME': 'your_database_name',
        'HOST': 'your_mongodb_host',
        'PORT': your_mongodb_port,
        'USER': 'your_mongodb_user',
        'PASSWORD': 'your_mongodb_password',
    }
}
```
Now, you can define your models in Django using the familiar `models.py` file. The only difference is that you need to use the `EmbeddedModelField` and `ArrayModelField` provided by `djongo` for embedded documents and arrays in MongoDB. Note that any model used as a `model_container` must be declared abstract. Here's an example:

```python
from djongo import models

class Author(models.Model):
    name = models.CharField(max_length=100)

    class Meta:
        abstract = True

class Book(models.Model):
    title = models.CharField(max_length=100)
    authors = models.ArrayModelField(
        model_container=Author,
    )
    publication_year = models.IntegerField()

    class Meta:
        abstract = True

class Library(models.Model):
    books = models.ArrayModelField(
        model_container=Book,
    )
    location = models.CharField(max_length=100)
```
In the above example, we have defined three models: `Author`, `Book`, and `Library`. The `ArrayModelField` is used to store arrays of embedded models.
Now, you can perform CRUD operations on your MongoDB database using Django's ORM. For example, to create a new book with authors and add it to the library, you can do the following:
```python
author1 = Author(name='John Doe')
author2 = Author(name='Jane Smith')

book = Book(title='Sample Book', authors=[author1, author2], publication_year=2022)
library = Library(books=[book], location='New York')
library.save()
```
This will save the `Library` object along with its associated `Book` and `Author` objects in MongoDB.
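Under the hood, djongo persists the embedded models as nested documents, so the whole object graph lives in a single MongoDB record. Conceptually (the exact BSON layout may differ), the saved library corresponds to something like this:

```python
# Illustrative sketch of the nested document djongo would store;
# field names mirror the models above.
library_doc = {
    "location": "New York",
    "books": [
        {
            "title": "Sample Book",
            "publication_year": 2022,
            "authors": [
                {"name": "John Doe"},
                {"name": "Jane Smith"},
            ],
        }
    ],
}

# Everything is in one document, so it can be read back without joins:
first_book = library_doc["books"][0]
author_names = [a["name"] for a in first_book["authors"]]
```

This single-document shape is what makes reads cheap for this kind of hierarchy, at the cost of duplicating author data across libraries.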
NoSQL Integration with Django: Cassandra
Cassandra is another popular NoSQL database that is known for its scalability and high availability. Integrating Cassandra with Django allows us to leverage its distributed architecture for managing big data. In this section, we will explore how to integrate Cassandra with Django and perform CRUD operations on the database.
To integrate Cassandra with Django, we need to install the `django-cassandra-engine` package, which provides the necessary tools and interfaces. Here's how you can install it:

```bash
pip install django-cassandra-engine
```
Once installed, you need to configure your Django settings to use Cassandra as the database backend. Update the `DATABASES` section in your `settings.py` file as follows:

```python
DATABASES = {
    'default': {
        'ENGINE': 'django_cassandra_engine',
        'NAME': 'your_keyspace_name',
        'TEST_NAME': 'your_test_keyspace_name',
        'HOST': 'your_cassandra_host',
        'PORT': your_cassandra_port,
    }
}
```
Now, you can define your models in Django using the familiar `models.py` file. The only difference is that models are built from the `DjangoCassandraModel` base class provided by `django-cassandra-engine`, with fields declared as columns from the underlying `cassandra-driver`. Here's an example:

```python
from django_cassandra_engine.models import DjangoCassandraModel
from cassandra.cqlengine import columns

class Book(DjangoCassandraModel):
    id = columns.Integer(primary_key=True)
    title = columns.Text()
    authors = columns.Text()
    publication_year = columns.Integer()
```
In the above example, we have defined a `Book` model with four fields: `id`, `title`, `authors`, and `publication_year`. The `primary_key` attribute is used to specify the primary key for the model.
Now, you can perform CRUD operations on your Cassandra database using Django's ORM. For example, to create a new book and save it to the database, you can do the following:
```python
book = Book(id=1, title='Sample Book', authors='John Doe, Jane Smith', publication_year=2022)
book.save()
```
This will save the `Book` object in Cassandra.
Pagination Techniques in Django
When dealing with large datasets in Django, pagination becomes crucial to ensure optimal performance and user experience. In this section, we will explore different pagination techniques that can be used in Django to efficiently handle large datasets.
One of the most common pagination techniques in Django is the use of the `Paginator` class provided by the `django.core.paginator` module. This class allows you to split a queryset into smaller chunks or pages, making it easier to navigate and display data.
Here's an example of how to use the `Paginator` class:

```python
from django.core.paginator import Paginator

# Assuming 'queryset' is your original queryset
paginator = Paginator(queryset, per_page=10)
page_number = request.GET.get('page')
page_obj = paginator.get_page(page_number)
```
In the above example, we create a `Paginator` object by passing in the original queryset and the number of items to display per page (in this case, 10). We then get the current page number from the request's GET parameters and use the `get_page()` method to retrieve the corresponding page object.
Once you have the page object, you can access the data for that page using the `object_list` attribute. Additionally, the `has_previous()`, `previous_page_number()`, `has_next()`, and `next_page_number()` methods can be used to navigate between pages.
```python
for item in page_obj.object_list:
    # Do something with each item
    ...
```
The `Paginator` class also provides other useful attributes, such as `count` for the total number of items in the queryset, `num_pages` for the total number of pages, and `page_range` for iterating over all page numbers.
Another pagination technique in Django is the use of cursor-based pagination. This technique is particularly useful when dealing with very large datasets, as it allows you to efficiently retrieve and display data without relying on offsets or limits.
To implement cursor-based pagination, you can use the `CursorPaginator` class provided by the `django-cursor-pagination` package. This package is not included in Django by default, so you need to install it separately:

```bash
pip install django-cursor-pagination
```
Once installed, you can use the `CursorPaginator` class. Its API differs slightly from the `Paginator` class: it requires a unique, explicit ordering, and pages are requested relative to an opaque cursor:

```python
from cursor_pagination import CursorPaginator

# Assuming 'queryset' is your original queryset and has a 'created' field;
# the ordering must be unique, hence the 'id' tiebreaker
paginator = CursorPaginator(queryset, ordering=('-created', '-id'))
cursor = request.GET.get('cursor')
page = paginator.page(first=10, after=cursor)
```

In the above example, we create a `CursorPaginator` object by passing in the original queryset and an explicit ordering. We then get the current cursor value from the request's GET parameters and use the `page()` method to retrieve the ten items that follow it.
Cursor-based pagination offers several advantages over traditional offset-based pagination. It eliminates the need to calculate offsets, which can be expensive for large datasets. It also provides better performance when navigating between pages, as it only retrieves the necessary data.
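The idea behind cursor-based (keyset) pagination can be sketched without Django at all: instead of an offset, each page request carries the key of the last item seen, and the next page is everything strictly after that key. A minimal in-memory sketch (real cursor libraries encode the cursor opaquely rather than exposing raw keys):

```python
def keyset_page(rows, per_page, after_id=None):
    """Return the next page of rows ordered by 'id', starting
    strictly after the given cursor (after_id)."""
    if after_id is not None:
        rows = [r for r in rows if r["id"] > after_id]
    page = rows[:per_page]
    # The cursor for the next request is the last id on a full page;
    # a short page means there is nothing more to fetch
    next_cursor = page[-1]["id"] if len(page) == per_page else None
    return page, next_cursor

rows = [{"id": i} for i in range(1, 8)]  # ids 1..7, already sorted

page1, cursor = keyset_page(rows, per_page=3)                    # ids 1, 2, 3
page2, cursor = keyset_page(rows, per_page=3, after_id=cursor)   # ids 4, 5, 6
page3, cursor = keyset_page(rows, per_page=3, after_id=cursor)   # id 7 only
```

In a database this `id > after_id` comparison becomes an indexed range scan, which is why it stays fast on page 10,000 where `OFFSET` would not.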
Filtering Large Datasets in Django
When working with large datasets in Django, filtering becomes crucial to extract the relevant information efficiently. In this section, we will explore different filtering techniques that can be used in Django to handle large datasets effectively.
Django provides a rich set of filtering options through the `filter()` method on querysets. This method allows you to specify conditions to narrow down the results based on specific field values.
Here's an example of how to use the `filter()` method:

```python
# Assuming 'ModelName' is the name of your model
objects = ModelName.objects.filter(field_name=value)
```
In the above example, we filter the queryset based on a specific field name and its corresponding value. This will return a new queryset containing only the objects that match the specified condition.
You can also chain multiple filter conditions together to create more complex queries. Django uses the logical AND operator by default to combine multiple filters.
```python
# Assuming 'ModelName' is the name of your model
objects = ModelName.objects.filter(field1=value1).filter(field2=value2)

# Equivalent single call:
# objects = ModelName.objects.filter(field1=value1, field2=value2)
```
In the above example, we filter the queryset based on two different field names and their corresponding values. This will return a new queryset containing only the objects that match both conditions.
Django also provides various lookup types that can be used with the `filter()` method to perform more specific filtering operations. For example, you can use the `contains` lookup to filter objects based on a substring match:

```python
# Assuming 'ModelName' is the name of your model
objects = ModelName.objects.filter(field__contains='substring')
```
In the above example, we filter the queryset based on `field` containing a specific substring. This will return a new queryset containing only the objects that match the condition.
Other useful lookup types include `exact`, `iexact`, `startswith`, `istartswith`, `endswith`, `iendswith`, `in`, `gt`, `gte`, `lt`, `lte`, and more. You can find a complete list of lookup types and their usage in the Django documentation.
Additionally, Django provides the `Q` object, which allows you to perform complex OR queries. This is useful when you want to filter objects based on multiple conditions, where at least one condition needs to be true.

```python
from django.db.models import Q

# Assuming 'ModelName' is the name of your model
objects = ModelName.objects.filter(Q(field1=value1) | Q(field2=value2))
```
In the above example, we filter the queryset based on two different field names and their corresponding values using the OR operator. This will return a new queryset containing objects that match at least one of the conditions.
Optimizing Performance in Django
Optimizing the performance of a Django application is essential, especially when dealing with big data. In this section, we will explore various techniques and best practices to optimize the performance of your Django application.
1. Use database indexes: Indexes play a crucial role in improving the performance of database queries. By indexing the fields that are frequently used in the WHERE clause, you can significantly speed up query execution. Django provides a convenient way to define indexes on model fields using the `db_index` attribute.

```python
class MyModel(models.Model):
    field1 = models.CharField(max_length=100, db_index=True)
    # ...
```
2. Use select_related() and prefetch_related(): These methods allow you to optimize database queries by reducing the number of database hits. `select_related()` performs a join for single-valued relationships such as foreign keys, while `prefetch_related()` fetches many-valued relationships using a separate query. By using these methods, you can minimize the number of database round-trips and improve performance.

```python
# Assuming 'ModelName' is the name of your model and 'related_field' is a related field
objects = ModelName.objects.select_related('related_field')

# For many-to-many or reverse foreign-key relationships:
objects = ModelName.objects.prefetch_related('related_field')
```
3. Use caching: Caching is a useful technique to reduce the load on your database and improve response times. Django provides built-in support for caching through the `cache` framework. You can cache the results of expensive database queries, view functions, or even entire web pages to serve them faster.

```python
from django.core.cache import cache

def get_data():
    data = cache.get('data')
    if data is None:
        data = expensive_database_query()
        cache.set('data', data, timeout=3600)  # Cache for 1 hour
    return data
```
4. Use pagination: When dealing with large datasets, it is essential to implement pagination to avoid loading all the data at once. As discussed earlier, Django provides the `Paginator` class to split querysets into smaller chunks or pages. By paginating the data, you can improve performance and provide a better user experience.
5. Optimize database queries: Analyzing and optimizing your database queries can have a significant impact on performance. Django provides a useful ORM that abstracts away the underlying database, but it is still important to understand how your queries translate to SQL. You can use tools like Django Debug Toolbar or EXPLAIN statements to identify and optimize slow queries.
6. Use caching at the view level: In addition to caching individual pieces of data, you can also cache entire views to improve performance. Django provides the `cache_page` decorator, which allows you to cache the output of a view function for a specified duration.

```python
from django.views.decorators.cache import cache_page

@cache_page(60 * 15)  # Cache for 15 minutes
def my_view(request):
    ...
```
7. Use asynchronous views: Asynchronous views can significantly improve the performance of your Django application, especially when dealing with I/O-bound operations. Django provides support for asynchronous views using the `async` and `await` keywords, allowing you to handle multiple requests concurrently.
8. Use database connection pooling: Connection pooling can help improve the performance of your Django application by reusing database connections instead of creating new ones for each request. Django provides support for connection pooling through third-party packages like `django-db-pool`.
9. Use caching at the template level: Django provides a template fragment caching mechanism that allows you to cache parts of your templates. By caching frequently used or computationally expensive parts of your templates, you can improve the rendering performance of your views.
10. Profile and optimize your code: It's important to profile your Django application to identify bottlenecks and areas that can be optimized. Use tools like Django Silk or Django Debug Toolbar to profile your code and identify areas that can be optimized for better performance.
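As a quick illustration of point 10, Python's built-in `cProfile` module can profile a function without any third-party tools; Django Silk and Debug Toolbar build the same idea into request handling. A minimal sketch:

```python
import cProfile
import io
import pstats

def slow_function():
    # Stand-in for an expensive view helper
    return sum(i * i for i in range(100_000))

profiler = cProfile.Profile()
profiler.enable()
result = slow_function()
profiler.disable()

# Render the most time-consuming calls into a report string
stream = io.StringIO()
stats = pstats.Stats(profiler, stream=stream).sort_stats("cumulative")
stats.print_stats(5)
report = stream.getvalue()
```

Reading the "cumulative" column of the report usually points straight at the function worth optimizing first.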
Handling Streaming Data in Django
Streaming data refers to a continuous flow of data that is generated and processed in real-time. In this section, we will explore how to handle streaming data in Django and leverage asynchronous views for better performance.
Django provides support for handling streaming data through the use of Django Channels, an official extension that allows you to build real-time applications with Django. Channels provides a way to handle long-lived connections, such as WebSockets, and enables bidirectional communication between the server and the client.
To handle streaming data in Django, you need to install the `channels` package and configure your Django settings to use Channels as the backend for handling WebSocket connections. Here's how you can install Channels:

```bash
pip install channels
```
Once installed, you need to add Channels to your Django project's `INSTALLED_APPS` and configure the routing for WebSocket connections. Create a `routing.py` file in your project's root directory and define the WebSocket routes:

```python
from channels.routing import ProtocolTypeRouter, URLRouter
from django.core.asgi import get_asgi_application
from django.urls import path

from myapp.consumers import MyConsumer

application = ProtocolTypeRouter({
    'http': get_asgi_application(),
    'websocket': URLRouter([
        path('ws/my_consumer/', MyConsumer.as_asgi()),
    ]),
})
```
In the above example, we define a WebSocket route for the `MyConsumer` consumer. The consumer is responsible for handling WebSocket connections and processing streaming data.
Next, create a `consumers.py` file in your app directory and define the `MyConsumer` class:

```python
from channels.generic.websocket import AsyncWebsocketConsumer

class MyConsumer(AsyncWebsocketConsumer):
    async def connect(self):
        await self.accept()

    async def disconnect(self, close_code):
        pass

    async def receive(self, text_data):
        # Process received data
        pass
```
In the above example, we define the `MyConsumer` class that inherits from `AsyncWebsocketConsumer`. The `connect()` method is called when a WebSocket connection is established, the `disconnect()` method is called when the connection is closed, and the `receive()` method is called when data is received from the client.
To handle streaming data, you can process the received data in the `receive()` method and send it back to the client using the `send()` method:

```python
async def receive(self, text_data):
    # Process received data
    processed_data = process_data(text_data)

    # Send processed data back to the client
    await self.send(text_data=processed_data)
```
With Channels, you can also use groups to handle multiple WebSocket connections simultaneously. This is useful when you want to broadcast data to multiple clients or perform real-time updates.
Inside an asynchronous consumer, the channel layer is available as `self.channel_layer`, and all of its group operations are awaited directly:

```python
from channels.generic.websocket import AsyncWebsocketConsumer

class MyConsumer(AsyncWebsocketConsumer):
    async def connect(self):
        # Add this client to a group
        await self.channel_layer.group_add('group_name', self.channel_name)
        await self.accept()

    async def disconnect(self, close_code):
        # Remove this client from the group
        await self.channel_layer.group_discard('group_name', self.channel_name)

    async def receive(self, text_data):
        # Broadcast the received data to every consumer in the group
        await self.channel_layer.group_send('group_name', {
            'type': 'process_data',
            'data': text_data,
        })

    async def process_data(self, event):
        # Handler invoked for 'process_data' messages sent to the group
        await self.send(text_data=event['data'])
```

In the above example, we add a client to a group on connect, remove it on disconnect, broadcast received data to the group with `group_send()`, and handle group messages in `process_data()`. From synchronous code outside a consumer, the same operations are available via `channels.layers.get_channel_layer()` wrapped in `asgiref.sync.async_to_sync()`.
Benefits of Asynchronous Views in Django
Asynchronous views in Django offer several benefits, especially when dealing with I/O-bound operations and handling large datasets. In this section, we will explore the benefits of using asynchronous views in Django and how they can improve the performance of your application.
1. Improved performance: Asynchronous views allow you to handle multiple requests concurrently, without blocking the main thread. This means that your Django application can continue to process other requests while waiting for I/O operations to complete. As a result, you can achieve better performance and responsiveness, especially when dealing with slow or long-running operations.
2. Better scalability: By using asynchronous views, you can handle a larger number of concurrent requests without the need for additional resources. Since asynchronous views are non-blocking, they allow your Django application to make more efficient use of system resources, resulting in better scalability and the ability to handle high traffic loads.
3. Reduced resource consumption: Asynchronous views consume fewer system resources compared to traditional synchronous views. This is because they do not tie up system threads while waiting for I/O operations to complete. As a result, your Django application can handle more requests with the same amount of resources, leading to improved resource utilization and cost-effectiveness.
4. Simplified code: Asynchronous views in Django use the `async` and `await` keywords, which provide a more natural and readable way to write asynchronous code. This makes it easier to handle complex I/O operations, such as network requests or database queries, without resorting to complicated callback functions or thread management.
5. Seamless integration with other asynchronous libraries: Django's support for asynchronous views allows you to seamlessly integrate with other asynchronous libraries and frameworks, such as asyncio or aiohttp. This gives you the flexibility to choose the best tools for your specific use case and take advantage of the extensive ecosystem of asynchronous Python libraries.
6. Improved user experience: Asynchronous views can greatly improve the user experience of your Django application, especially when dealing with long-running operations or real-time updates. By offloading time-consuming tasks to background processes and providing real-time updates through WebSockets or server-sent events, you can create a more interactive and engaging user interface.
It's important to note that not all parts of your Django application need to be implemented using asynchronous views. Asynchronous views are most effective when dealing with I/O-bound operations, such as network requests or database queries. For CPU-bound operations, such as complex computations or heavy data processing, traditional synchronous views may still be more appropriate.
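The performance benefit for I/O-bound work can be seen with plain asyncio, which is exactly what async Django views build on. In this sketch, the sleeps stand in for network or database calls; three of them run concurrently instead of back to back:

```python
import asyncio
import time

async def fake_io_call(duration):
    # Stand-in for a network request or database query
    await asyncio.sleep(duration)
    return duration

async def main():
    start = time.perf_counter()
    # Three 0.1s "requests" overlap rather than run sequentially
    results = await asyncio.gather(
        fake_io_call(0.1), fake_io_call(0.1), fake_io_call(0.1)
    )
    elapsed = time.perf_counter() - start
    return results, elapsed

results, elapsed = asyncio.run(main())
# elapsed is close to 0.1s, not the 0.3s sequential execution would take
```

This is the concurrency model an async view gets for free: while one request awaits I/O, the event loop serves others.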
Integrating Hadoop with Django for Big Data Analytics
Hadoop is a popular open-source framework for distributed storage and processing of large datasets. Integrating Hadoop with Django allows you to leverage its useful capabilities for big data analytics. In this section, we will explore how to integrate Hadoop with Django and perform big data analytics.
To integrate Hadoop with Django, you need to install the `hdfs` package, which provides a Python interface to the Hadoop Distributed File System (HDFS). Here's how you can install it:

```bash
pip install hdfs
```
Once installed, you can use the `hdfs` package to interact with Hadoop from your Django application. For example, you can read data from HDFS, write data to HDFS, or stage the input and output files of MapReduce jobs.
Here's an example of how to read data from HDFS:
```python
from hdfs import InsecureClient

# Create an HDFS client
client = InsecureClient('http://your_hadoop_host:50070', user='your_hadoop_user')

# Read a file from HDFS
with client.read('/path/to/file.txt') as reader:
    data = reader.read()
    # Process the data
```
In the above example, we create an `InsecureClient` object by providing the Hadoop host URL and the username. We then use the `read()` method to read a file from HDFS and process the data.
Similarly, you can use the `write()` method to write data to HDFS:

```python
from hdfs import InsecureClient

# Create an HDFS client
client = InsecureClient('http://your_hadoop_host:50070', user='your_hadoop_user')

# Write data to HDFS; pass an encoding when writing str rather than bytes
with client.write('/path/to/file.txt', encoding='utf-8') as writer:
    writer.write('data')
```
In the above example, we create an `InsecureClient` object and use the `write()` method to write data to a file in HDFS.
You can also perform MapReduce jobs using Hadoop Streaming. Hadoop Streaming allows you to write MapReduce jobs in any programming language that can read from standard input and write to standard output. You can use Python to write MapReduce jobs and execute them on Hadoop.
Here's an example of a simple MapReduce job written in Python:
from hdfs import InsecureClient # Create an HDFS client client = InsecureClient('http://your_hadoop_host:50070', user='your_hadoop_user') # Upload input file to HDFS client.upload('/input/file.txt', 'input.txt') # Define the MapReduce job job = client.run_job('/path/to/hadoop-streaming.jar', input_paths='/input/file.txt', output_path='/output', mapper='mapper.py', reducer='reducer.py') # Wait for the job to complete job.wait_for_completion() # Download the output file from HDFS client.download('/output/part-00000', 'output.txt')
In the above example, we upload an input file to HDFS, define the MapReduce job parameters, run the job using the run_job()
method, wait for the job to complete using the wait_for_completion()
method, and download the output file from HDFS.
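The `mapper.py` and `reducer.py` scripts referenced above are ordinary programs that read standard input and write standard output. A minimal word-count pair might look like this (sketched here as functions over line iterators so the logic is easy to test; in the real scripts each would iterate over `sys.stdin` and print its output):

```python
def mapper(lines):
    # Emit "word<TAB>1" for every word on every input line
    for line in lines:
        for word in line.split():
            yield f"{word}\t1"

def reducer(lines):
    # Hadoop sorts mapper output by key, so identical words arrive adjacent
    current_word, current_count = None, 0
    for line in lines:
        word, count = line.rsplit("\t", 1)
        if word == current_word:
            current_count += int(count)
        else:
            if current_word is not None:
                yield f"{current_word}\t{current_count}"
            current_word, current_count = word, int(count)
    if current_word is not None:
        yield f"{current_word}\t{current_count}"

# Simulate the shuffle phase locally with sorted()
mapped = sorted(mapper(["the quick fox", "the lazy dog"]))
reduced = list(reducer(mapped))
```

The `sorted()` call stands in for Hadoop's shuffle-and-sort phase, which is what lets the reducer process one key at a time in constant memory.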
Related Article: How to Unzip Files in Python
Integrating Spark with Django for Big Data Analytics
Apache Spark is a fast and general-purpose cluster computing system that provides useful tools for big data processing and analytics. Integrating Spark with Django allows you to leverage its distributed computing capabilities for big data analytics. In this section, we will explore how to integrate Spark with Django and perform big data analytics.
To integrate Spark with Django, you need to install the `pyspark` package, which provides a Python interface to interact with Spark. Here's how you can install it:

```bash
pip install pyspark
```
Once installed, you can use the `pyspark` package to interact with Spark from your Django application. For example, you can read data from various data sources, perform data transformations, and run distributed computations.
Here's an example of how to read data from a CSV file using Spark:

```python
from pyspark.sql import SparkSession

# Create a Spark session
spark = SparkSession.builder.appName('my_app').getOrCreate()

# Read data from a CSV file
df = spark.read.csv('/path/to/file.csv', header=True, inferSchema=True)
```
In the above example, we create a Spark session using the `SparkSession` class, specifying the application name. We then use the `read.csv()` method to read data from a CSV file into a DataFrame.
Once you have the data in a DataFrame, you can perform various transformations and computations. For example, you can filter rows based on a condition, aggregate data, or join multiple DataFrames.
```python
# Filter rows based on a condition
filtered_df = df.filter(df['column'] > 10)

# Aggregate data
aggregated_df = df.groupBy('column').agg({'column': 'sum'})

# Join multiple DataFrames (assuming df1 and df2 share 'column')
joined_df = df1.join(df2, on='column')
```
In the above examples, we filter rows based on a condition, aggregate data by summing a column, and join two DataFrames based on a common column.
Spark also provides support for running distributed computations using the RDD (Resilient Distributed Dataset) API. RDDs are a fundamental data structure in Spark that allow for efficient distributed processing.
Here's an example of how to perform a word count using RDDs:

```python
from pyspark import SparkContext

# Create a Spark context
sc = SparkContext(appName='my_app')

# Create an RDD from a text file
rdd = sc.textFile('/path/to/file.txt')

# Perform word count
word_count = rdd.flatMap(lambda line: line.split(' ')) \
                .map(lambda word: (word, 1)) \
                .reduceByKey(lambda a, b: a + b)

# Collect the results
results = word_count.collect()
```
In the above example, we create a Spark context using the `SparkContext` class, specifying the application name. We then create an RDD from a text file using the `textFile()` method and perform a word count using the `flatMap()`, `map()`, and `reduceByKey()` methods. Finally, we collect the results using the `collect()` method.
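The semantics of `flatMap`, `map`, and `reduceByKey` can be reproduced with plain Python, which makes it clear what Spark distributes across the cluster. A minimal local sketch of the same word count:

```python
from collections import defaultdict

lines = ["spark makes big data simple", "big data big results"]

# flatMap: one line -> many words
words = [word for line in lines for word in line.split(" ")]

# map: word -> (word, 1)
pairs = [(word, 1) for word in words]

# reduceByKey: sum the counts for each word
counts = defaultdict(int)
for word, count in pairs:
    counts[word] += count

word_count = dict(counts)
```

Spark's value is that each of these stages runs partitioned across many machines, with `reduceByKey` shuffling identical keys onto the same node before summing.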
Implementing Data Warehousing in Django-based Applications
Data warehousing is a process of collecting, storing, and managing data from various sources to provide business intelligence and support decision-making. In this section, we will explore how to implement data warehousing in Django-based applications.
Django provides a useful ORM (Object-Relational Mapping) that allows you to define and manage your database schema using Python code. To implement data warehousing in Django, you can use the ORM to define the necessary models and relationships.
Here's an example of how to define a data warehouse model in Django:

```python
from django.db import models

class FactSales(models.Model):
    date = models.DateField()
    product = models.ForeignKey('Product', on_delete=models.CASCADE)
    region = models.ForeignKey('Region', on_delete=models.CASCADE)
    quantity = models.IntegerField()
    amount = models.DecimalField(max_digits=10, decimal_places=2)

class Product(models.Model):
    name = models.CharField(max_length=100)
    category = models.ForeignKey('Category', on_delete=models.CASCADE)

class Region(models.Model):
    name = models.CharField(max_length=100)

class Category(models.Model):
    name = models.CharField(max_length=100)
```
In the above example, we define a `FactSales` model that represents the fact table in our data warehouse. It contains foreign keys to the `Product` and `Region` models, which represent the dimension tables. The `Product` model has a foreign key to the `Category` model, representing another dimension.
Once you have defined your data warehouse models, you can use Django's migration system to create the necessary database tables. Run the following command to generate the migration files:
```bash
python manage.py makemigrations
```
Then, apply the migrations to create the tables:
```bash
python manage.py migrate
```
With the tables in place, you can start populating your data warehouse by importing data from various sources. This can be done using Django's ORM or by writing custom scripts to import data.
For example, let's say you have a CSV file containing sales data. You can write a script to read the CSV file and populate the `FactSales` table using Django's ORM:

```python
import csv
from datetime import datetime
from decimal import Decimal

from myapp.models import FactSales, Product, Region

with open('sales.csv', 'r') as file:
    reader = csv.reader(file)
    next(reader)  # Skip header row
    for row in reader:
        date = datetime.strptime(row[0], '%Y-%m-%d').date()
        product = Product.objects.get(name=row[1])
        region = Region.objects.get(name=row[2])
        quantity = int(row[3])
        amount = Decimal(row[4])
        FactSales.objects.create(date=date, product=product, region=region,
                                 quantity=quantity, amount=amount)
```
In the above example, we read the CSV file row by row, convert the date string to a `date` object, and retrieve the corresponding `Product` and `Region` objects using their names. We then create a new `FactSales` object and save it to the database.
Once your data warehouse is populated, you can use Django's ORM to query and analyze the data. For example, you can perform aggregations, filter data based on specific criteria, or join multiple tables.
```python
from django.db.models import Sum

# Total sales amount by region
total_sales = FactSales.objects.values('region').annotate(total_amount=Sum('amount'))

# Sales by product category
sales_by_category = FactSales.objects.values('product__category').annotate(total_amount=Sum('amount'))

# Sales by region and category
sales_by_region_category = FactSales.objects.values('region__name', 'product__category__name').annotate(total_amount=Sum('amount'))
```
In the above examples, we use Django's ORM to perform aggregations on the `FactSales` table, grouping the data by region, product category, or both. The `values()` method is used to specify the fields to group by, and the `annotate()` method is used to perform the aggregation.
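What `values().annotate()` computes can be mirrored in plain Python: group rows by a key and sum a measure. A minimal sketch with dictionaries standing in for `FactSales` rows (the field values are illustrative):

```python
from collections import defaultdict
from decimal import Decimal

# Dictionaries standing in for FactSales rows
fact_sales = [
    {"region": "North", "category": "Books", "amount": Decimal("10.00")},
    {"region": "North", "category": "Toys",  "amount": Decimal("5.50")},
    {"region": "South", "category": "Books", "amount": Decimal("7.25")},
]

# Equivalent of FactSales.objects.values('region').annotate(total_amount=Sum('amount'))
totals_by_region = defaultdict(Decimal)
for row in fact_sales:
    totals_by_region[row["region"]] += row["amount"]
```

The difference in practice is that the ORM pushes this GROUP BY into the database, so only the small aggregated result crosses the wire rather than every fact row.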
ETL Processes in Django-based Applications
ETL (Extract, Transform, Load) is a process used to collect data from various sources, transform it into a consistent format, and load it into a target system. In this section, we will explore how to implement ETL processes in Django-based applications.
Django provides a useful ORM (Object-Relational Mapping) that allows you to define and manage your database schema using Python code. To implement ETL processes in Django, you can use the ORM to extract data from various sources, transform it, and load it into your target system.
Here's an example of how to implement an ETL process in Django:
```python
from myapp.models import SourceModel, TargetModel

# Extract data from the source
source_data = SourceModel.objects.all()

# Transform the data
transformed_data = []
for item in source_data:
    transformed_item = {
        'field1': item.field1,
        'field2': item.field2,
        # Perform transformations on the fields
    }
    transformed_data.append(transformed_item)

# Load the data into the target
for item in transformed_data:
    target_item = TargetModel(**item)
    target_item.save()
```
In the above example, we extract data from the `SourceModel` using Django's ORM, perform transformations on the fields, and load the transformed data into the `TargetModel`.
Depending on your specific requirements, the extraction step can involve reading data from various sources, such as databases, APIs, or CSV files. Django's ORM provides support for connecting to different databases and fetching data using the familiar queryset syntax.
For example, to extract data from a MySQL database, you can define a model in Django that represents the table you want to extract data from:
```python
from django.db import models

class SourceModel(models.Model):
    field1 = models.CharField(max_length=100)
    field2 = models.IntegerField()
    # ...
```
Once you have defined the model, you can use Django's ORM to fetch the data:
```python
from myapp.models import SourceModel

source_data = SourceModel.objects.all()
```
The transformation step involves manipulating the extracted data to meet the requirements of the target system. This can include cleaning up data, performing calculations, or combining multiple fields.
In the above example, we perform transformations on the fields by creating a new dictionary with the transformed values. The transformed data is stored in a list, which can later be loaded into the target system.
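The transformation step can be isolated into a small, testable function. A sketch under illustrative assumptions (the field names and cleanup rules are hypothetical, not part of the original models):

```python
def transform(source_item):
    """Normalize one source record into the shape the target model expects."""
    return {
        # Trim stray whitespace and normalize casing from the source system
        "field1": source_item["field1"].strip().title(),
        # Coerce numeric text to an integer, defaulting to 0 when missing
        "field2": int(source_item.get("field2") or 0),
    }

raw = [
    {"field1": "  alice smith ", "field2": "42"},
    {"field1": "BOB JONES", "field2": None},
]
transformed = [transform(item) for item in raw]
```

Keeping the transform as a pure function of one record makes the ETL pipeline easy to unit-test independently of the database.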
Finally, the load step involves inserting the transformed data into the target system. This can be done using Django's ORM by creating instances of the target model and saving them to the database.
In the above example, we create new instances of the `TargetModel` using the transformed data and save them to the database using the `save()` method.