Building Spotify-Style Search with Algolia in Go
Eight years ago, I built Lavafoshi - a local music streaming app for the Maldives. Fresh out of college with just one year of professional experience, I was eager to create something meaningful. Looking back now, as I prepare to release a major update, I realize how much I’ve learned about building software.
The original search was built with what every junior developer reaches for: SQL `LIKE` queries with manual relevance scoring. It worked for a few hundred songs, but as our catalog grew to tens of thousands of tracks, the cracks began to show. Response times crept up, relevance suffered, and users started complaining. It was time for a complete overhaul.
This is the story of how I migrated from SQL-based search to Algolia, achieving sub-50ms response times while building a Spotify-quality search experience. I’ll share the architectural decisions, technical challenges, and hard-won lessons that might save you from the pitfalls I encountered.
The breaking point: When SQL isn’t enough
Before diving into the solution, let me paint a picture of what I was dealing with. Our original search looked something like this:
```sql
SELECT * FROM songs
WHERE name LIKE '%user_query%'
ORDER BY monthly_listeners DESC
LIMIT 50;
```
Simple, right? And it worked—until it didn’t. With over 10,000 songs, basic `LIKE` queries were taking more time than I was comfortable with. Users expect search to be instant, especially when they’re used to Spotify’s lightning-fast responses.
The real challenge wasn’t just speed, though. We needed a search experience that could:
- Handle typos gracefully (“Naashid” → “Nashid”)
- Search across multiple content types simultaneously (songs, artists, albums, playlists)
- Provide intelligent ranking based on popularity and relevance
- Support real-time suggestions and faceted filtering
- Scale to handle concurrent users without breaking a sweat
SQL simply wasn’t designed for this kind of search complexity.
Choosing the right tool: Meilisearch vs. Algolia
I evaluated two main contenders: Meilisearch and Algolia. Having worked with Meilisearch before, I was comfortable with its capabilities and had experience self-hosting it. But here’s the thing about side projects: you want to focus on building, not managing infrastructure.
Algolia’s cloud pricing caught my attention. Their generous free tier and pay-as-you-go model aligned perfectly with Lavafoshi’s usage patterns. More importantly, their AI features like personalized recommendations opened doors for future enhancements I was excited to explore.
Architecture deep dive: The unified index approach
The first major architectural decision was how to structure our indices. Algolia gives you two main options:
Option 1: Separate indices for each content type
`songs_index`, `albums_index`, `artists_index`, `playlists_index`
Option 2: Unified index with type discrimination
`music_search_index` (with `type: song|album|artist|playlist`)
I chose the unified approach, and here’s why this decision proved crucial:
User experience comes first
When users search for “Ali Rameez”, they don’t want to see just songs or just albums; they want everything. A unified index delivers mixed results in a single, fast query, exactly like Spotify does.
Simplified ranking logic
With separate indices, you’d need to maintain different ranking rules for each content type, then somehow merge and re-rank results on your backend. With a unified index, one set of ranking rules handles everything consistently.
Resource efficiency
One index means one set of settings to configure, one batch of data to sync, and one endpoint to optimize. As your search requirements evolve, you’re making changes in one place.
Here’s the foundational structure I built:
```go
type SearchableItem struct {
	ObjectID         string `json:"objectID"`
	Type             string `json:"type"` // "song", "album", "artist", "playlist"
	Name             string `json:"name"`
	ArtistName       string `json:"artist_name,omitempty"`
	AlbumName        string `json:"album_name,omitempty"`
	Duration         int32  `json:"duration,omitempty"`
	MonthlyListeners int32  `json:"monthly_listeners"`
	FollowersCount   int32  `json:"followers_count"`
	HasLyrics        bool   `json:"has_lyrics,omitempty"`
	ArtworkURL       string `json:"artwork_url"`
	Biography        string `json:"biography,omitempty"`
	ReleasedDate     string `json:"released_date,omitempty"`
}
```
The Power of Denormalization
Notice how I’m storing `artist_name` and `album_name` directly in each song record, rather than just foreign key references. This is denormalization—trading storage space for query performance.
In a traditional database, this would be heresy. But search indices aren’t databases. They’re optimized for read performance, and the cost of a few extra megabytes is negligible compared to the performance gains.
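To make the denormalization concrete, here’s a minimal sketch of flattening a song plus its related artist and album rows into a single index record. The type and field names are illustrative stand-ins for Lavafoshi’s actual models:

```go
package main

import "fmt"

// Minimal stand-ins for the app's models; names are illustrative.
type Artist struct{ Name string }
type Album struct{ Name string }
type Song struct {
	ID               uint64
	Name             string
	MonthlyListeners int32
}

// SearchableItem mirrors the unified index record shown above.
type SearchableItem struct {
	ObjectID         string `json:"objectID"`
	Type             string `json:"type"`
	Name             string `json:"name"`
	ArtistName       string `json:"artist_name,omitempty"`
	AlbumName        string `json:"album_name,omitempty"`
	MonthlyListeners int32  `json:"monthly_listeners"`
}

// denormalizeSong copies the artist and album names into the song
// record, so a single index lookup needs no joins at query time.
func denormalizeSong(s Song, artist Artist, album Album) SearchableItem {
	return SearchableItem{
		ObjectID:         fmt.Sprintf("song_%d", s.ID),
		Type:             "song",
		Name:             s.Name,
		ArtistName:       artist.Name,
		AlbumName:        album.Name,
		MonthlyListeners: s.MonthlyListeners,
	}
}

func main() {
	item := denormalizeSong(
		Song{ID: 42, Name: "Dhonkamana", MonthlyListeners: 1200},
		Artist{Name: "Ali Rameez"},
		Album{Name: "Greatest Hits"},
	)
	fmt.Println(item.ObjectID, item.ArtistName) // song_42 Ali Rameez
}
```

The join happens once at indexing time instead of on every search, which is exactly the trade being made.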
Service Architecture: Keeping It Clean
Since Lavafoshi already had a solid service-oriented architecture, integrating Algolia needed to feel natural. I organized the code like this:
```
├── internal/services/algolia.go   # Core Algolia service
├── internal/handlers/search.go    # HTTP API handlers
├── internal/repository/           # Existing data access layer
└── cmd/indexer/                   # CLI indexing tool
```
The key insight here is separation of concerns. The Algolia service handles search index operations, the handlers manage HTTP requests, and the CLI tools handle bulk operations. Each component has a single responsibility, making the codebase easier to maintain and test.
Content-Specific Modeling
Each content type gets its own optimized structure. Here’s how I modeled songs:
```go
type SearchableSong struct {
	ObjectID         string `json:"objectID"`
	ID               uint64 `json:"id"`
	Name             string `json:"name"`
	ArtistName       string `json:"artist_name"`
	AlbumName        string `json:"album_name"`
	Duration         int32  `json:"duration"`
	MonthlyListeners int32  `json:"monthly_listeners"`
	HasLyrics        bool   `json:"has_lyrics"`
	ArtworkURL       string `json:"artwork_url"`
	Type             string `json:"type"` // Always "song"
}
```
The `Type` field is crucial—it enables faceted search, allowing users to filter results by content type. The `ObjectID` follows Algolia’s convention of being unique across all records in the index.
Search Configuration: The Art of Relevance
Here’s where things get interesting. Algolia’s ranking algorithm is highly configurable, but with great power comes the need for thoughtful decisions. After analyzing how users interact with music search, I developed this ranking strategy:
```go
settings := search.Settings{
	SearchableAttributes: opt.SearchableAttributes(
		"name", "artist_name", "album_name", "biography",
	),
	AttributesForFaceting: opt.AttributesForFaceting(
		"type", "has_lyrics", "released_date",
	),
	Ranking: opt.Ranking(
		"desc(monthly_listeners)", // Popularity first
		"desc(followers_count)",   // Then social proof
		"typo",                    // Typo tolerance
		"geo", "words", "filters", // Algolia defaults
		"proximity", "attribute", "exact", "custom",
	),
	CustomRanking: opt.CustomRanking(
		"desc(monthly_listeners)", "desc(followers_count)",
	),
}
```
Why Popularity First?
In music search, popularity often correlates with relevance. When someone searches for “Shape”, they probably want Ed Sheeran’s “Shape of You”, not an obscure jazz track with “shape” in the lyrics. By putting `monthly_listeners` first in the ranking, popular tracks naturally bubble to the top.
This doesn’t mean unpopular music gets buried—Algolia’s algorithm balances all ranking factors. It just means that when relevance is equal, popularity acts as the tiebreaker.
The importance of faceting
Faceting allows users to filter results dynamically. By making `type`, `has_lyrics`, and `released_date` facetable, users can refine their search to show only songs with lyrics or only albums from the 2000s. This dramatically improves the search experience for power users while keeping it simple for casual browsers.
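As a sketch of how those facets get used at query time, a small helper like this can compose the filter expression passed along with a search request. The helper itself is hypothetical, not part of the Algolia client; the facet names match the settings above:

```go
package main

import (
	"fmt"
	"strings"
)

// buildFilters composes an Algolia filter expression from optional
// facet selections. An empty result means "no filtering".
func buildFilters(contentType string, lyricsOnly bool) string {
	var parts []string
	if contentType != "" {
		parts = append(parts, fmt.Sprintf("type:%s", contentType))
	}
	if lyricsOnly {
		parts = append(parts, "has_lyrics:true")
	}
	return strings.Join(parts, " AND ")
}

func main() {
	fmt.Println(buildFilters("song", true)) // type:song AND has_lyrics:true
	fmt.Println(buildFilters("album", false))
}
```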
Building the CLI Indexer: Batch Processing Done Right
Initial data indexing and ongoing maintenance required a robust CLI tool. Here’s how I approached batch processing at scale:
```sh
go run ./cmd/indexer -type=all -batch-size=1000
```
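A minimal sketch of how the indexer might parse those flags. The flag names match the invocation above; everything else is illustrative, and using a `FlagSet` keeps the parsing testable outside of `main`:

```go
package main

import (
	"flag"
	"fmt"
)

// parseFlags reads the indexer's command-line options.
func parseFlags(args []string) (contentType string, batchSize int, err error) {
	fs := flag.NewFlagSet("indexer", flag.ContinueOnError)
	typeFlag := fs.String("type", "all", "content type: all|song|album|artist|playlist")
	batchFlag := fs.Int("batch-size", 1000, "records per indexing batch")
	if err := fs.Parse(args); err != nil {
		return "", 0, err
	}
	return *typeFlag, *batchFlag, nil
}

func main() {
	contentType, batchSize, err := parseFlags([]string{"-type=song", "-batch-size=500"})
	if err != nil {
		panic(err)
	}
	fmt.Printf("indexing type=%s batch=%d\n", contentType, batchSize)
}
```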
The Batch Processing Strategy
```go
func syncSongs(ctx context.Context, repo repository.SongRepository,
	algoliaService *services.AlgoliaService, batchSize int) error {

	offset := 0
	totalSynced := 0

	for {
		songs, err := repo.GetAll(offset, batchSize)
		if err != nil {
			return fmt.Errorf("failed to get songs at offset %d: %w", offset, err)
		}
		if len(songs) == 0 {
			break // End of data
		}

		// Transform and index the batch
		if err := algoliaService.IndexSongs(ctx, songs); err != nil {
			return fmt.Errorf("failed to index batch at offset %d: %w", offset, err)
		}

		totalSynced += len(songs)
		log.Printf("Synced %d songs (total: %d)", len(songs), totalSynced)
		offset += batchSize

		// Early termination for smaller final batch
		if len(songs) < batchSize {
			break
		}
	}

	log.Printf("Successfully synced %d songs total", totalSynced)
	return nil
}
```
Lessons Learned About Batch Sizes
Getting batch sizes right took some experimentation:
- Too small (< 100 records): Network overhead dominates, indexing becomes glacially slow
- Too large (> 5000 records): Memory usage spikes, potential timeout issues
- Sweet spot (1000-2000 records): Good balance of throughput and resource usage
The key insight is that batch size isn’t just about performance, it’s about reliability. Smaller batches mean that if something goes wrong, you lose less work and can resume more easily.
API Handler: The Great Migration
The biggest architectural shift happened in our search handler. The transformation from multiple SQL queries to a single Algolia call was dramatic:
Before: The SQL Approach
```go
func (h *SearchHandler) Search(c *gin.Context) {
	// Multiple database hits with manual result merging
	songs := h.searchSongs(keyword)         // ~150ms
	artists := h.searchArtists(keyword)     // ~100ms
	albums := h.searchAlbums(keyword)       // ~120ms
	playlists := h.searchPlaylists(keyword) // ~80ms

	// Manual relevance scoring and merging
	results := h.combineAndRankResults(songs, artists, albums, playlists)
	// Total: ~450ms + processing time
}
```
After: The Algolia Approach
```go
func (h *SearchHandler) Search(c *gin.Context) {
	searchOptions := services.SearchOptions{
		Query:       keyword,
		Page:        page,
		HitsPerPage: hitsPerPage,
		Facets:      []string{"type"},
		Filters:     typeFilter, // Optional: "type:song"
	}

	results, err := h.algoliaService.Search(ctx, searchOptions)
	// Total: ~25ms
	if err != nil {
		c.JSON(500, gin.H{"error": "Search failed"})
		return
	}

	c.JSON(200, results)
}
```
The difference is stark: from four database queries plus processing time to a single API call. But the real magic is in what Algolia handles automatically—typo tolerance, relevance scoring, faceted search, and pagination all work out of the box.
Response Format: Unified Yet Flexible
I designed a response format that preserves type information while enabling unified results:
```go
type SearchItem struct {
	Type string      `json:"type"`
	Data interface{} `json:"data"`
}

type SearchResponse struct {
	Results     []SearchItem              `json:"results"`
	Page        int                       `json:"page"`
	HitsPerPage int                       `json:"hits_per_page"`
	TotalHits   int                       `json:"total_hits"`
	TotalPages  int                       `json:"total_pages"`
	Facets      map[string]map[string]int `json:"facets,omitempty"`
}
```
This structure gives frontend developers the flexibility they need. They can render all results in a unified list while applying type-specific styling and interactions based on the `type` field.
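As a sketch of how a handler might assemble that response from raw hits, here are two illustrative helpers (`wrapHits` and `totalPages` are not from the actual codebase; the `SearchItem` struct is repeated so the snippet is self-contained):

```go
package main

import "fmt"

type SearchItem struct {
	Type string      `json:"type"`
	Data interface{} `json:"data"`
}

// wrapHits tags each raw Algolia hit with its content type so the
// frontend can render a mixed results list.
func wrapHits(hits []map[string]interface{}) []SearchItem {
	items := make([]SearchItem, 0, len(hits))
	for _, hit := range hits {
		contentType, _ := hit["type"].(string)
		items = append(items, SearchItem{Type: contentType, Data: hit})
	}
	return items
}

// totalPages computes the page count reported in the response.
func totalPages(totalHits, hitsPerPage int) int {
	if hitsPerPage <= 0 {
		return 0
	}
	return (totalHits + hitsPerPage - 1) / hitsPerPage // ceiling division
}

func main() {
	hits := []map[string]interface{}{
		{"type": "song", "name": "Dhonkamana"},
		{"type": "artist", "name": "Ali Rameez"},
	}
	items := wrapHits(hits)
	fmt.Println(len(items), items[0].Type, totalPages(95, 20)) // 2 song 5
}
```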
Technical Challenges: The Devil in the Details
Challenge 1: Go’s Type System vs. JSON Flexibility
Algolia returns search results as `map[string]interface{}`, which doesn’t play nicely with Go’s static type system. Here’s how I handled the conversion:
```go
func (h *SearchHandler) convertHitToSong(hit map[string]interface{}) models.Song {
	// JSON numbers come back as float64, need safe conversion
	id, ok := hit["id"].(float64)
	if !ok {
		log.Printf("Warning: invalid ID for song hit: %v", hit["id"])
		id = 0
	}

	duration, _ := hit["duration"].(float64)
	monthlyListeners, _ := hit["monthly_listeners"].(float64)
	name, _ := hit["name"].(string)
	artistName, _ := hit["artist_name"].(string)
	artworkURL, _ := hit["artwork_url"].(string) // a bare assertion here would panic on a missing field

	return models.Song{
		ID:               uint64(id),
		Name:             name,
		ArtistName:       artistName,
		Duration:         int32(duration),
		MonthlyListeners: int32(monthlyListeners),
		ArtworkURL:       artworkURL,
	}
}
```
The key lesson here: always use type assertions defensively. JSON unmarshaling in Go can be unpredictable, especially when dealing with numbers and optional fields.
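One way to keep that defensiveness from cluttering every converter is a pair of small extractor helpers. These are illustrative, not from the actual codebase:

```go
package main

import "fmt"

// Safe extractors for map[string]interface{} hits: each returns the
// zero value when the key is missing or has an unexpected type,
// instead of panicking like a bare type assertion would.
func getString(hit map[string]interface{}, key string) string {
	v, _ := hit[key].(string)
	return v
}

func getInt32(hit map[string]interface{}, key string) int32 {
	// encoding/json decodes all JSON numbers into float64.
	v, _ := hit[key].(float64)
	return int32(v)
}

func main() {
	hit := map[string]interface{}{"name": "Dhonkamana", "duration": float64(214)}
	fmt.Println(getString(hit, "name"), getInt32(hit, "duration")) // Dhonkamana 214
	fmt.Println(getString(hit, "missing") == "")                   // true
}
```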
Challenge 2: Repository Pattern Integration
Our existing repositories used different patterns for data access. For indexing to work smoothly, I needed consistency:
```go
type SongRepository interface {
	GetByID(id uint64) (*models.Song, error)
	GetAll(offset, limit int) ([]models.Song, error)           // Added for indexing
	GetRecentlyUpdated(since time.Time) ([]models.Song, error) // For incremental sync
	// ... existing methods
}
```
Adding these methods to all repository interfaces created a consistent pattern for bulk operations, making the indexer implementation much cleaner.
Performance Results: The Numbers Don’t Lie
The migration to Algolia delivered transformational performance improvements:
| Metric | Before (SQL) | After (Algolia) | Improvement |
|---|---|---|---|
| Average Response Time | 200-500ms | 15-50ms | 10x faster |
| P95 Response Time | 1200ms | 80ms | 15x faster |
| Search Features | Basic `LIKE` matching | Typo tolerance, synonyms, faceting | Immeasurable |
Operational Considerations: Keeping It Running
The Indexing Strategy Evolution
I started with a simple approach but learned to be more sophisticated:
Phase 1: Full reindex everything
```sh
# Daily full reindex - simple but wasteful
0 2 * * * cd /app && ./indexer -type=all
```
Phase 2: Smart incremental updates
```sh
# Hourly incremental sync for new/updated content
0 * * * * cd /app && ./indexer -type=all -since=1h
```
Phase 3: Near real-time (planned)
Event-driven updates via webhooks when content changes.
The lesson here is to start simple and evolve. A daily full reindex might seem inefficient, but it’s reliable and easy to reason about. Once you understand your data patterns and growth rate, you can optimize for efficiency.
Monitoring and Alerting
Search infrastructure is critical infrastructure. I monitor for:
- Index freshness (when was the last successful sync?)
- Search response times (are queries getting slower?)
- Error rates (are searches failing?)
- Index size growth (am I approaching plan limits?)
```go
// Simple health check endpoint
func (h *SearchHandler) HealthCheck(c *gin.Context) {
	stats, err := h.algoliaService.GetIndexStats()
	if err != nil {
		c.JSON(500, gin.H{"status": "unhealthy", "error": err.Error()})
		return
	}

	c.JSON(200, gin.H{
		"status":       "healthy",
		"index_size":   stats.RecordCount,
		"last_updated": stats.LastUpdated,
	})
}
```
Lessons learned: What I wish I’d known
The Do’s
✅ Start with a unified index: The complexity of managing multiple indices isn’t worth the theoretical benefits for most use cases.
✅ Invest heavily in data modeling: Spend time thinking about what fields to index, how to structure your data, and what attributes to make searchable. This is where 80% of your search quality comes from.
✅ Build comprehensive CLI tools early: You’ll need them more than you think for debugging, maintenance, and data migrations.
✅ Use typed converters religiously: Go’s type system is your friend, even when dealing with `interface{}` from JSON APIs.
✅ Plan for growth from day one: Design your indexing strategy to handle 10x your current data volume.
The Don’ts
❌ Don’t guess at batch sizes: Profile your indexing process and find the sweet spot through measurement, not intuition.
❌ Don’t index sensitive data: Algolia isn’t encrypted at rest by default. Keep user passwords, personal information, and sensitive metadata out of your search index.
❌ Don’t ignore operational costs: Search queries cost money. Implement reasonable rate limiting and monitoring to avoid surprise bills.
❌ Don’t forget about data freshness: Users notice when search results are stale. Plan your sync strategy carefully and monitor it religiously.
❌ Don’t optimize prematurely: Start with simple approaches (daily full reindex) and optimize based on real usage patterns, not theoretical concerns.
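On the rate-limiting point above: a token bucket is enough for a first pass. This stdlib-only sketch is illustrative; a maintained rate-limiting package may serve better in production:

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// Limiter is a minimal token bucket: tokens refill at `rate` per
// second up to `max`, and each allowed request spends one token.
type Limiter struct {
	mu       sync.Mutex
	tokens   float64
	max      float64
	rate     float64
	lastSeen time.Time
}

func NewLimiter(rate, burst float64) *Limiter {
	return &Limiter{tokens: burst, max: burst, rate: rate, lastSeen: time.Now()}
}

// Allow reports whether one request may proceed now.
func (l *Limiter) Allow() bool {
	l.mu.Lock()
	defer l.mu.Unlock()
	now := time.Now()
	l.tokens += now.Sub(l.lastSeen).Seconds() * l.rate
	if l.tokens > l.max {
		l.tokens = l.max
	}
	l.lastSeen = now
	if l.tokens >= 1 {
		l.tokens--
		return true
	}
	return false
}

func main() {
	lim := NewLimiter(1, 2) // 1 req/s refill, burst of 2
	fmt.Println(lim.Allow(), lim.Allow(), lim.Allow()) // true true false
}
```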
Looking forward: The next chapter
With the core search infrastructure solid, I’m excited about what comes next for Lavafoshi:
Real-time indexing
Moving from batch updates to event-driven, real-time indexing. When an artist releases a new song, it should be searchable within seconds, not hours.
Personalization at Scale
Algolia’s AI features open up possibilities for personalized search results based on listening history, geographical trends, and collaborative filtering.
Advanced Analytics
Understanding how users search provides insights into content gaps, trending artists, and feature opportunities. Search analytics can drive both product and content strategy.
Final thoughts: Beyond just speed
Migrating from SQL to Algolia was about more than just performance. It was about building a foundation for the future. The old search was a bottleneck that limited how users could discover content. The new search is an enabler that opens up new interaction patterns and user behaviors.
The technical implementation matters, but what matters more is understanding your users’ mental model of search. They expect it to “just work”—to find what they’re looking for regardless of typos, to surface popular content naturally, and to respond instantly.
Building that experience requires the right combination of technology, architecture, and operational discipline. Algolia provided the technology foundation, clean service architecture made integration seamless, and careful attention to operational details ensures it keeps working reliably.
If you’re building any kind of content discovery experience, don’t make the same mistake I did eight years ago. SQL `LIKE` queries might seem simple, but they’re technical debt that compounds over time. Invest in proper search infrastructure early—your users (and your future self) will thank you.
The complete implementation, from CLI indexing tools to API handlers and search configuration, provides a solid blueprint for any developer looking to build world-class search functionality. Sometimes the best technical decisions are the ones that let you focus on building features instead of fighting infrastructure.
And that’s exactly what good search should be—invisible infrastructure that enables great user experiences.