$❯ CrowderSoup

A website about programming, technology, and life.

Basic Analytics in a Django App

by Aaron Crowder on

I recently started my journey of rebuilding this blog using Django. It's a learning experience for me, with the side benefit of nudging me to write more. Something I wanted to add was some super basic analytics to help make sure this site is actually being used by someone other than me.

I wanted to see things like a (somewhat naive) count of unique visitors, total page views, where requests were coming from (both referrer and geographically), and response codes (mostly so I can find 404s).

The last thing I wanted was to install some third-party tracking. I can grab all of this data myself from each request and just log it. From there I only need a simple dashboard.

So I decided to build it. Here it is:

I also decided that since I'm now logging this data I should include a privacy policy for this blog.

How?

Django makes it really easy to track all this using middleware. But let's start with the model.

Model

What we're tracking is a Visit.

class Visit(models.Model):
    session_key = models.CharField(max_length=40, db_index=True, blank=True, null=True)
    user = models.ForeignKey(
        settings.AUTH_USER_MODEL, null=True, blank=True, on_delete=models.SET_NULL
    )
    ip_address = models.GenericIPAddressField(null=True, blank=True)
    user_agent = models.TextField(blank=True)
    path = models.CharField(max_length=512)
    referrer = models.TextField(blank=True)
    started_at = models.DateTimeField(auto_now_add=True)
    ended_at = models.DateTimeField(null=True, blank=True)
    duration_seconds = models.PositiveIntegerField(null=True, blank=True)
    country = models.CharField(max_length=2, blank=True)
    region = models.CharField(max_length=64, blank=True)
    city = models.CharField(max_length=128, blank=True)
    response_status_code = models.IntegerField(null=True, blank=True)

    def __str__(self):
        return self.path

    class Meta:
        indexes = [
            models.Index(fields=["path"]),
            models.Index(fields=["session_key", "started_at"]),
        ]

This model captures everything I want to track about a visit. Each Visit is a page view, and I can get a naive idea of unique visits from the session_key. Each session has a key, so grouping by session_key gives me unique sessions.

The country, region, and city fields are guessed from the request's IP address.

Middleware

The middleware is where we actually collect the data.

class AnalyticsMiddleware(MiddlewareMixin):
    def process_request(self, request):
        if request.path.startswith("/admin"):
            return
        request._analytics_start_ts = time.time()

    def process_response(self, request, response):
        try:
            if request.path.startswith("/admin"):
                return response

            started_ts = getattr(request, "_analytics_start_ts", None)
            if started_ts is None:
                return response

            duration = int(time.time() - started_ts)

            session_key = getattr(request, "session", None) and request.session.session_key
            if session_key is None and hasattr(request, "session"):
                request.session.save()
                session_key = request.session.session_key

            ip = get_client_ip(request)
            geo = geolocate_ip(ip) if ip else {}

            visit = Visit.objects.create(
                ... # Fill out the visit with collected data
            )

            request.visit_id = visit.id
        except DatabaseError:
            conn = transaction.get_connection()
            if conn.in_atomic_block:
                transaction.set_rollback(False)
        except Exception:
            pass

        return response

This middleware gathers everything we need in two steps. process_request runs early and sets a timestamp so we can track roughly how long the request took. process_response collects the rest of the data and creates the Visit row.

Once the visit is stored, I store the visit_id on the request. This lets me attach extra client-side events later, like how long someone stayed on the page or what they clicked. That flexibility is a big reason I rolled my own instead of using an existing analytics tool. I also just don’t want any of this data shared with a third party.

The whole thing is wrapped in a try/catch so analytics failures never take down the site.

Admin View (Dashboard)

The admin view, which you saw a screenshot of above, is where everything gets visualized. In my analytics app I have this in admin.py:

@admin.register(Visit)
class VisitAdmin(ModelAdmin):
    list_display = ("path", "session_key", "user", "response_status_code", "started_at")
    search_fields = ("path", "session_key", "user__username", "ip_address")
    list_filter = ("response_status_code", "country", "region")

    change_list_template = "admin/visit_dashboard.html"

    def changelist_view(self, request, extra_context=None):
        qs = self.get_queryset(request)

        stats = qs.aggregate(
            total_page_views=Count("id"),
            unique_sessions=Count("session_key", distinct=True),
            unique_users=Count("user", distinct=True),
            unique_ips=Count("ip_address", distinct=True),
        )

        visitors_by_country = (
            qs.values("country")
            .exclude(country="")
            .annotate(count=Count("id"))
            .order_by("-count")
        )

        error_visits = (
            qs.filter(response_status_code__gte=400)
            .select_related("user")
            .order_by("-started_at")[:10]
        )

        extra_context = extra_context or {}
        extra_context.update(
            stats=stats,
            visitors_by_country=visitors_by_country,
            error_visits=error_visits,
        )
        return super().changelist_view(request, extra_context=extra_context)

This overrides the default admin template for the changelist, pulls together the data I want, and sends it all to the template. The template itself is simple:

{% extends "admin/change_list.html" %}

{% block result_list %}
<!-- Custom dashboard markup here -->
{{ block.super }} 
{% endblock %}

This is one of the things I love about Django. The built-in admin makes it incredibly easy to build a blog without having to design an admin UI. But when I do want to customize it, the admin gets out of the way and lets me build whatever I want.


Wrapping Up

This whole setup is pretty lightweight, but it gives me exactly what I want: visibility into how the site is being used, with full control over the data and no third-party scripts following people around. It also gives me room to expand later with front-end event tracking or more detailed dashboards.

If you're using Django and want simple, privacy-friendly analytics without the overhead of a full analytics service, this approach works great and is fun to build.