Tonmoy Talukder, G M Shahariar
Abstract
This paper introduces Bangla Key2Text, a large-scale dataset of 2.6 million Bangla keyword–text pairs designed for keyword-driven text generation in a low-resource language. The dataset is constructed using a BERT-based keyword extraction pipeline applied to millions of Bangla news texts, transforming raw articles into structured keyword–text pairs suitable for supervised learning. To establish baseline performance on this new benchmark, we fine-tune two sequence-to-sequence models, mT5 and BanglaT5, and evaluate them using multiple automatic metrics and human judgments. Experimental results show that task-specific fine-tuning substantially improves keyword-conditioned text generation in Bangla compared to zero-shot large language models. The dataset, trained models, and code are publicly released to support future research in Bangla natural language generation and keyword-to-text generation tasks.
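The dataset pairs extracted keywords with their source texts for supervised sequence-to-sequence training. As a minimal sketch of what such a pair might look like, the snippet below joins a keyword list into a prompt-style source string and pairs it with the original text as the target; the prompt prefix and field names here are illustrative assumptions, not the paper's actual format.

```python
# Hypothetical sketch of forming a keyword–text training pair for a
# seq2seq model such as mT5 or BanglaT5. The prompt prefix and the
# "source"/"target" field names are assumptions for illustration.

def make_pair(keywords, text, prefix="generate text from keywords: "):
    """Join keywords into a source string and pair it with the
    original article text as the generation target."""
    source = prefix + ", ".join(keywords)
    return {"source": source, "target": text}

# Example with Bangla placeholder keywords and text.
pair = make_pair(["ঢাকা", "নির্বাচন"], "ঢাকায় আজ নির্বাচন অনুষ্ঠিত হয়েছে।")
print(pair["source"])
```

In a real fine-tuning setup, each such pair would be tokenized and fed to the encoder (source) and decoder (target) of the model.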
BibTeX Citation
@article{talukder2026bangla,
  title={Bangla Key2Text: Text Generation from Keywords for a Low Resource Language},
  author={Talukder, Tonmoy and Shahariar, G M},
  journal={arXiv preprint arXiv:2604.19508},
  year={2026}
}