Halfway Point Reflection (9/30/25)

Halfway through this course, I have a significantly better understanding of both the theory and the practice of database systems. Here are five important takeaways from the first portion of the class:

1. The Relational Model and Its Relevance

    I learned how the relational model forms the basis of contemporary data systems, structuring data into tables with rows and columns. The model emphasizes data integrity and explicit relationships between tables, which keeps large datasets consistently and meaningfully organized.

    2. SQL Essentials: One-Table Queries and Constraints

    Early in the course, I became comfortable writing simple SELECT, INSERT, UPDATE, and DELETE statements. I learned to define primary keys, foreign keys, and other constraints that enforce rules on the data and prevent inconsistencies.
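
To make the point about constraints concrete, here is a minimal sketch using Python's built-in sqlite3 module. The `department` and `employee` tables, their columns, and the helper names `build_db` and `try_insert` are all hypothetical, made up for illustration rather than taken from the course material.

```python
import sqlite3


def build_db():
    """Create a small in-memory database with keys and constraints."""
    conn = sqlite3.connect(":memory:")
    # SQLite does not enforce foreign keys unless this is switched on.
    conn.execute("PRAGMA foreign_keys = ON")
    conn.execute("""
        CREATE TABLE department (
            dept_id INTEGER PRIMARY KEY,
            name    TEXT NOT NULL UNIQUE
        )""")
    conn.execute("""
        CREATE TABLE employee (
            emp_id  INTEGER PRIMARY KEY,
            name    TEXT NOT NULL,
            salary  REAL CHECK (salary >= 0),
            dept_id INTEGER NOT NULL REFERENCES department(dept_id)
        )""")
    conn.execute("INSERT INTO department VALUES (1, 'Engineering')")
    conn.execute("INSERT INTO employee VALUES (1, 'Ada', 95000, 1)")
    return conn


def try_insert(conn, row):
    """Return True if the employee row is accepted, False if a constraint rejects it."""
    try:
        conn.execute("INSERT INTO employee VALUES (?, ?, ?, ?)", row)
        return True
    except sqlite3.IntegrityError:
        return False


conn = build_db()
valid_row = try_insert(conn, (2, "Bob", 50000, 1))     # satisfies every constraint
bad_dept = try_insert(conn, (3, "Eve", 50000, 99))     # no department 99 (foreign key)
bad_salary = try_insert(conn, (4, "Neg", -1, 1))       # negative salary (CHECK)
duplicate_id = try_insert(conn, (1, "Dup", 10, 1))     # emp_id 1 already exists (primary key)
```

Only the first insert should succeed; the database itself rejects the other three, so the application never has to remember to check those rules.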

    3. Advanced SQL Features: Aggregation, Joins, Subqueries, and Views

    Building on the fundamentals, I learned how to join data across multiple tables using INNER, LEFT, and other join types. I also practiced aggregate functions such as SUM, COUNT, and AVG, and used subqueries to break large problems into smaller pieces. Defining and querying views showed me how SQL can encapsulate query logic, making repetitive or hard-to-read queries easier to write and reuse.
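
Here is a small sketch of how these features fit together, again with Python's sqlite3 module. The schema, the sample rows, and the view name `dept_summary` are assumptions I invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE department (dept_id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE employee (
        emp_id  INTEGER PRIMARY KEY,
        name    TEXT NOT NULL,
        salary  REAL,
        dept_id INTEGER REFERENCES department(dept_id)
    );
    INSERT INTO department VALUES (1, 'Engineering'), (2, 'Sales'), (3, 'Legal');
    INSERT INTO employee VALUES (1, 'Ada', 100, 1), (2, 'Bob', 80, 1), (3, 'Cy', 60, 2);

    -- The view hides the join and the aggregation behind a simple name.
    -- LEFT JOIN keeps departments that have no employees (Legal).
    CREATE VIEW dept_summary AS
    SELECT d.name           AS dept,
           COUNT(e.emp_id)  AS headcount,
           AVG(e.salary)    AS avg_salary
    FROM department d
    LEFT JOIN employee e ON e.dept_id = d.dept_id
    GROUP BY d.dept_id, d.name;
""")

# Querying the view looks like querying an ordinary table.
summary = conn.execute(
    "SELECT dept, headcount, avg_salary FROM dept_summary ORDER BY dept"
).fetchall()

# INNER JOIN drops the department without employees.
inner_rows = conn.execute("""
    SELECT e.name, d.name
    FROM employee e
    INNER JOIN department d ON d.dept_id = e.dept_id
    ORDER BY e.name
""").fetchall()

# Subquery: employees paid more than the overall average salary.
above_avg = conn.execute("""
    SELECT name FROM employee
    WHERE salary > (SELECT AVG(salary) FROM employee)
    ORDER BY name
""").fetchall()
```

In this sample data, Legal shows up in `summary` with a headcount of 0 and a NULL average (returned as `None`), while it is missing from `inner_rows` entirely. That difference is the clearest way I found to remember what LEFT JOIN adds over INNER JOIN.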

    4. Data Storage and Indexing in Databases

    In Week 3, I learned how data is physically stored in heap tables and how free space is tracked. The explanation of indexes was particularly valuable: I now understand the distinction between ordered indexes (like B-trees) and hash indexes, and how the choice affects query speed. For example, an ordered index can serve both equality and range lookups, while a hash index is suited to equality lookups.
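
The difference between the two index families can be sketched in plain Python. This is only an analogy under my own assumptions: a `dict` stands in for a hash index and a sorted list searched with `bisect` stands in for an ordered index. A real DBMS uses on-disk structures such as B-trees, and the row data and function names here are hypothetical.

```python
from bisect import bisect_left, bisect_right

# Hypothetical heap of rows in arbitrary (insertion) order: (key, payload).
rows = [(17, "Ada"), (42, "Bob"), (8, "Cy"), (23, "Dee"), (42, "Eli")]

# Hash-style index: key -> positions of matching rows in the heap.
hash_index = {}
for pos, (key, _) in enumerate(rows):
    hash_index.setdefault(key, []).append(pos)

# Ordered-style index: (key, position) pairs kept sorted by key.
ordered_index = sorted((key, pos) for pos, (key, _) in enumerate(rows))
ordered_keys = [key for key, _ in ordered_index]


def hash_lookup(key):
    """Equality lookup: jump straight to the bucket for this key."""
    return [rows[pos][1] for pos in hash_index.get(key, [])]


def range_lookup(lo, hi):
    """Range lookup (lo <= key <= hi): binary search for the start, then read in order."""
    start = bisect_left(ordered_keys, lo)
    end = bisect_right(ordered_keys, hi)
    return [rows[pos][1] for _, pos in ordered_index[start:end]]


equal_42 = hash_lookup(42)          # ['Bob', 'Eli']
between = range_lookup(10, 30)      # ['Ada', 'Dee']
```

The hash-style structure has no notion of "next key", so answering the range query with it would mean checking every key, which is why range predicates favor ordered indexes.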

    5. Entity-Relationship (ER) Model and Normal Forms

    Recently, I learned to design databases with ER diagrams and to normalize them using normal forms. Normalization keeps data redundancy to a necessary minimum, so the structure supports efficient querying while maintaining data integrity.
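
A small worked example helped me see what the redundancy costs. The sketch below is my own illustration, not a course exercise: the enrollment data and table names are hypothetical, and I assume that each course has one instructor and each instructor has one office.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Unnormalized: instructor and office are repeated on every enrollment row.
    CREATE TABLE enrollment_flat (
        student TEXT, course TEXT, instructor TEXT, instructor_office TEXT
    );
    INSERT INTO enrollment_flat VALUES
        ('Ana', 'DB', 'Kim', 'B101'),
        ('Ben', 'DB', 'Kim', 'B101'),
        ('Ana', 'OS', 'Lee', 'C202');

    -- Normalized: each fact is stored once, in the table whose key determines it.
    CREATE TABLE instructor (name TEXT PRIMARY KEY, office TEXT NOT NULL);
    CREATE TABLE course (
        title      TEXT PRIMARY KEY,
        instructor TEXT NOT NULL REFERENCES instructor(name)
    );
    CREATE TABLE enrollment (
        student TEXT,
        course  TEXT REFERENCES course(title),
        PRIMARY KEY (student, course)
    );
    INSERT INTO instructor SELECT DISTINCT instructor, instructor_office FROM enrollment_flat;
    INSERT INTO course     SELECT DISTINCT course, instructor FROM enrollment_flat;
    INSERT INTO enrollment SELECT student, course FROM enrollment_flat;
""")

# Moving Kim to a new office touches every copy in the flat table...
flat_rows_changed = conn.execute(
    "UPDATE enrollment_flat SET instructor_office = 'B999' WHERE instructor = 'Kim'"
).rowcount

# ...but only one row in the normalized design.
normalized_rows_changed = conn.execute(
    "UPDATE instructor SET office = 'B999' WHERE name = 'Kim'"
).rowcount

# Joining the normalized tables reproduces the flat table, so nothing was lost.
rebuilt = conn.execute("""
    SELECT e.student, e.course, c.instructor, i.office
    FROM enrollment e
    JOIN course c     ON c.title = e.course
    JOIN instructor i ON i.name = c.instructor
    ORDER BY e.student, e.course
""").fetchall()
flat = conn.execute(
    "SELECT * FROM enrollment_flat ORDER BY student, course"
).fetchall()
```

With this data the flat update changes two rows and the normalized update changes one. If the flat update had missed a row, the table would hold two different offices for Kim, which is exactly the inconsistency normalization is meant to rule out.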

    Questions I Have About Databases

    1. Practical Limits of Normalization:

    When does normalization go too far and start to hurt performance, and when is it safe to denormalize a design?

    2. Indexing Strategy:

    Although I understand the general idea behind ordered and hash indexes, I still wonder: what is the best way to choose index types for different query patterns, and how do DBAs make these trade-offs in practical systems?

    3. Scalability and New Systems:

    How do relational database principles apply to extremely large-scale systems (e.g., cloud databases, distributed databases), and in particular, how do features such as joins and indexes behave when data is distributed across many servers?
