Kirthi Raman

Welcome to

My interests are math, computers, problem solving, algorithms, teaching, and music.

About Me

Experienced in the fields of data engineering, data analytics, and software development, with a strong track record of delivering successful projects and driving growth. Have a reputation for being results-driven, well-organized, and detail-oriented. Areas of expertise also include advanced machine learning and data visualization.

Have experience in product management and design, as well as training and mentoring product teams. Extensive experience with Amazon Web Services (AWS) and proficient in various technologies, including Cloudera Hadoop, Apache Spark, Python, Java, and more.

Education
M.S. Computer Science - University of Maryland, Baltimore County (1991) GPA 3.7
M.Tech Computer Science - Indian Institute of Technology, Delhi (1984) GPA 3.8
M.S. Mathematics - Indian Institute of Technology, Delhi (1982) GPA 3.98

Additional
Published book titled "Mastering Python Data Visualization" in 2015 with Packt Publishing, London.
Buy From https://www.amazon.com/Mastering-Python-Visualization-Kirthi-Raman/dp/1783988320
Member of Mathematics Stack Exchange with a reputation of 7,484 (top 4%) and 36 badges.
(view https://math.stackexchange.com/users/25538/kirthi-raman)

Cloud Experience

ETL work using AWS Lambda and AWS Glue.
Hive user-defined functions (UDFs). CloudWatch metrics and visualization dashboards.

Created a data catalog by extracting data from Elasticsearch with Python.

Distributed Computing

Apache Spark and Dataframes using Spark SQL and Streaming. Combining Python and Hive in a distributed environment for efficient ETL. Creating Frameworks for UI Tables and Customized Visualizations.

Writing template updates, bulk re-indexing and de-duping for ElasticSearch in Python.

Cloud Tools

Amazon EC2
Amazon DynamoDB
Amazon SageMaker
Amazon RDS
Amazon EMR
Amazon Textract

Amazon MemoryDB for Redis
AWS Lambda
AWS Elastic Beanstalk
Amazon CloudWatch
AWS Glue
AWS Amplify and Firebase

Work Experience

CTO, Founder Onescrybe.ai

As the Chief Technology Officer and Founder of Onescrybe.ai, I lead the company’s technical strategy by integrating strong mathematical foundations with practical, production-level AI engineering. With dual master’s degrees in Mathematics and Computer Science, I combine deep theoretical expertise with hands-on experience in Python, machine learning, and scalable system design. This background enables me to architect robust AI solutions, guide technical innovation, and translate complex computational concepts into clear, strategic business value. My work focuses on building high-performance AI platforms that balance rigorous scientific principles with real-world applicability.

Consultant SME, Classified Location

In this role, I was responsible for extracting and analyzing metadata from Kibana components—including dashboards, visualizations, Lens assets, and saved searches—as well as Elasticsearch indices, policies, and transforms. I mapped fields across these assets to establish dependency relationships, enabling precise impact assessments for planned schema or field modifications. I also oversaw daily document-count monitoring to detect anomalies in data ingestion and ensure proactive alerts for operational continuity.
Additionally, I assessed and implemented large language models (LLMs) to optimize daily workflows within classified projects. My work focused on three key areas: automating code refactoring, standardizing documentation practices, and iteratively improving the architectural design of new task pipelines. These efforts streamlined processes, enhanced consistency, and accelerated the development of mission-critical solutions.

Leidos, Reston VA
Principal Data Scientist/Engineer, 2019-Current

Developed data tools, algorithms, and UI frameworks to monitor and improve business performance using the Spring/Java framework and Python/FastAPI. Served as a technical lead on a large, complex architecture involving 30-50 terabytes of data in Elasticsearch. Worked across multiple teams to support various algorithms that boost performance and handle high volumes of data efficiently.

Neustar Inc, Sterling VA
Senior Manager: Data Engineering, 2013-2018

Devised and executed sustainable, data-driven solutions using cloud and data technologies, creating and deploying end-to-end systems. Conducted code reviews, verified quality control processes, and ensured optimal performance. Orchestrated large-scale Cross Device and Match Testing and Integration (CDMTI) functions, generating growth of $5M per year. Automated the ETL process for select customers, resulting in accelerated CDMTI functions, reduced errors, and improved operational efficiency. Technologies include Cloudera Hadoop, Apache Spark, Hive, Python, D3.js, JavaScript, Jenkins, JIRA Confluence, Git, Java, Scala, R, Scikit-Learn.

Quotient Inc, Columbia MD
Principal Consultant, 2003-2013

Navigated software engineering product development, overseeing continuous improvement activities and establishing standards and best practices. Designed control system architecture, created user interfaces, and administered key tools. Led a diverse team in synthesizing clinical and FDA data for a MapReduce system, identifying productivity and improvement areas for hospitals. Collaborated with technical specialists to standardize data across its lifecycle through development and governance. Technologies used: Cloudera Hadoop, Hive, Python, D3.js, JavaScript, JIRA Confluence, GIT Repository, Java, Scikit-Learn. Applied Regression, Random Forest, Text Mining, and Natural Language Processing techniques.

Longitude Systems, Chantilly VA
Product Manager, 1999-2003

Played a critical role in managing a team of engineers developing multiple products for a provisioning system targeting the ISP market. Designed and administered a proof of concept for venture capital purposes, successfully securing a first round of $10M funding. Contributed to the recruitment of high-performing engineers. This startup was sold to a third-party software company in 2003.

Independent
Consultant, 1993-1999

Involved in the early implementation of a search engine that was based on a research paper from UC Berkeley.

Additional Achievements
Published book titled "Mastering Python Data Visualization" in 2015 with Packt Publishing, London.
Volunteer: North South Foundation (Math Olympiad Coaching).
Member of Mathematics Stack Exchange with a reputation of 7,484 (top 4%) and 36 badges. (view https://math.stackexchange.com/users/25538/kirthi-raman)

Technology

Many technologies in the cloud stand out, but a few that I am especially interested in are:

NLP
Speech Recognition
Machine Learning

NLP
Natural language processing (NLP) is the field in which AI systems are taught the rules and syntax of language, build models that represent those rules, and then use those models to carry out specific tasks. These tasks can include:

Language generation: AI apps generate new text based on given prompts or contexts, such as generating text for chatbots, virtual assistants, or even creative writing.
Answering questions: AI apps respond to users who've asked a question in natural language on a specific topic.
Sentiment analysis: AI apps analyze text to determine the sentiment or emotional tone of the writer, such as whether the text expresses a positive, negative, or neutral sentiment.
Text classification: AI classifies text into different categories or topics, such as categorizing news articles into politics, sports, or entertainment.
Machine translation: AI translates text from one language to another, such as from English to Spanish.
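
To make one of the tasks above concrete, here is a minimal sketch of sentiment analysis. It is a toy, lexicon-based scorer rather than a trained model: the word lists POSITIVE and NEGATIVE and the function name sentiment are hypothetical and chosen only for illustration. Production systems typically learn these associations from labeled data.

```python
import re

# Hypothetical, hand-picked word lists; a real system would learn these from data.
POSITIVE = {"good", "great", "excellent", "love", "enjoyable"}
NEGATIVE = {"bad", "poor", "terrible", "hate", "boring"}

def sentiment(text: str) -> str:
    """Label text as positive, negative, or neutral by counting lexicon hits."""
    words = re.findall(r"[a-z']+", text.lower())
    score = sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"

for review in ["I love this great movie",
               "A boring and terrible plot",
               "It was released in 2015"]:
    print(review, "->", sentiment(review))
```

The scorer ignores negation ("not good") and context, which is exactly the gap that learned models are meant to close.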

Speech Recognition
Speech recognition, a groundbreaking technology in the realm of human-computer interaction, allows machines to interpret and convert spoken language into written text. This technology has seen significant advancements over the years, driven by machine learning techniques such as deep learning and neural networks. From voice assistants like Siri and Google Assistant to transcription services and accessibility tools, speech recognition has found applications in various domains.

How Speech Recognition Works

The process of speech recognition involves several stages:

Acoustic Signal Processing: The input audio signal is transformed into a format that can be analyzed. This involves breaking down the audio into smaller segments called frames.
Feature Extraction: Features like Mel-Frequency Cepstral Coefficients (MFCCs) are extracted from each frame. These features highlight the relevant characteristics of the audio signal for subsequent analysis.
Acoustic Modeling: A trained acoustic model, often based on neural networks, learns to map the extracted features to phonemes or sub-word units. This helps in identifying the phonetic content of the speech.
Language Modeling: Language models provide context and help in understanding the sequence of words. These models consider the probability of word combinations and help in selecting the most likely words given the context.
Decoding: Using the acoustic and language models, the system decodes the most probable sequence of words that corresponds to the spoken input.
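
The first two stages above can be sketched with NumPy alone. This is a simplified illustration, not a full front end: it splits a synthetic 440 Hz tone into overlapping frames and computes a log power spectrum per frame. The 25 ms frame length, 10 ms hop, and the function names frame_signal and log_power_spectrum are assumptions made for this example. True MFCCs would add a mel filterbank and a discrete cosine transform on top of this.

```python
import numpy as np

def frame_signal(signal, frame_len, hop_len):
    """Split a 1-D signal into overlapping frames of frame_len samples."""
    n_frames = 1 + (len(signal) - frame_len) // hop_len
    starts = hop_len * np.arange(n_frames)
    return np.stack([signal[s:s + frame_len] for s in starts])

def log_power_spectrum(frames):
    """Apply a Hamming window to each frame and return its log power spectrum."""
    windowed = frames * np.hamming(frames.shape[1])
    power = np.abs(np.fft.rfft(windowed, axis=1)) ** 2
    return np.log(power + 1e-10)  # small constant avoids log(0)

# One second of a synthetic 440 Hz tone sampled at 16 kHz (stand-in for speech)
sample_rate = 16000
t = np.arange(sample_rate) / sample_rate
signal = np.sin(2 * np.pi * 440 * t)

# 400 samples = 25 ms frames, 160 samples = 10 ms hop (assumed values)
frames = frame_signal(signal, frame_len=400, hop_len=160)
features = log_power_spectrum(frames)
print(frames.shape, features.shape)
```

Each row of features is the kind of per-frame vector that an acoustic model would consume in the next stage.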

Example Code in Python using SpeechRecognition Library

Here's a simple example of speech recognition using the SpeechRecognition library in Python. Before running this code, make sure you have the library installed (pip install SpeechRecognition).

import speech_recognition as sr

# Create a recognizer object
recognizer = sr.Recognizer()

# Capture audio from the microphone
with sr.Microphone() as source:
    print("Say something...")
    audio = recognizer.listen(source)

# Perform speech recognition
try:
    text = recognizer.recognize_google(audio)
    print("You said:", text)
except sr.UnknownValueError:
    print("Sorry, I couldn't understand.")
except sr.RequestError as e:
    print("Error fetching results; {0}".format(e))

In this example, the code captures audio from the microphone, processes it using Google's speech recognition service, and then prints the recognized text. However, various other engines and models can be used with the SpeechRecognition library.

Challenges and Future Directions

While speech recognition has made impressive strides, challenges remain, such as handling accents, noisy environments, and complex sentence structures. Ongoing research focuses on improving accuracy and expanding language support.

As technology evolves, speech recognition is expected to play an integral role in enabling more intuitive human-computer interaction, making devices and applications more accessible and user-friendly for everyone.

Machine Learning
Machine Learning: Unleashing Intelligence Through Data
Machine Learning (ML) is a transformative field of artificial intelligence that empowers computers to learn from data and improve their performance over time. Instead of being explicitly programmed to perform tasks, machines use algorithms to learn patterns from data and make informed decisions or predictions. ML has found applications across various domains, from healthcare and finance to image recognition and recommendation systems.

Types of Machine Learning

Supervised Learning: In this approach, the model is trained on a labeled dataset where the input data is paired with the correct output. The model learns to make predictions by generalizing from the training data. Examples include image classification, spam detection, and sentiment analysis.
Unsupervised Learning: Unsupervised learning deals with unlabeled data. The model identifies patterns and structures within the data without explicit guidance. Clustering and dimensionality reduction are common tasks in unsupervised learning.
Reinforcement Learning: In reinforcement learning, an agent interacts with an environment and learns by receiving feedback in the form of rewards or penalties. The agent aims to maximize the cumulative reward over time. This is often used in robotics, game playing, and autonomous systems.
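
As an illustration of unsupervised learning, the sketch below clusters unlabeled 2-D points with a minimal k-means loop written in NumPy. The data points and the function name kmeans are made up for this example, and the initial centers are simply taken from evenly spaced points, which is a simplifying assumption. Library implementations such as scikit-learn's KMeans use more careful initialization.

```python
import numpy as np

def kmeans(points, k, n_iter=10):
    """Minimal k-means: alternate between assigning points and moving centers."""
    # Simplifying assumption: start from k evenly spaced points in the dataset
    idx = np.linspace(0, len(points) - 1, k).astype(int)
    centers = points[idx].copy()
    for _ in range(n_iter):
        # Assign each point to its nearest center
        distances = np.linalg.norm(points[:, None, :] - centers[None, :, :], axis=2)
        labels = distances.argmin(axis=1)
        # Move each center to the mean of the points assigned to it
        for j in range(k):
            if np.any(labels == j):
                centers[j] = points[labels == j].mean(axis=0)
    return labels, centers

# Hypothetical unlabeled data: two visibly separate groups
points = np.array([[1.0, 1.0], [1.2, 0.8], [0.9, 1.1],
                   [5.0, 5.0], [5.2, 4.9], [4.8, 5.1]])
labels, centers = kmeans(points, k=2)
print(labels)
print(centers)
```

No labels are supplied at any point; the grouping emerges from the distances between points alone.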

Machine Learning Examples

Image Classification with Convolutional Neural Networks (CNNs): CNNs are a type of neural network designed for image processing. They have revolutionized tasks like image classification, object detection, and facial recognition. For instance, a CNN can be trained to classify images of animals, distinguishing between cats and dogs.
Natural Language Processing (NLP) with Recurrent Neural Networks (RNNs): RNNs are used for sequence data, making them suitable for language-related tasks. Sentiment analysis, machine translation, and text generation are examples of NLP applications. A sentiment analysis model could classify movie reviews as positive or negative based on their content.
Recommendation Systems with Collaborative Filtering: Recommendation systems suggest items to users based on their preferences and behaviors. Collaborative filtering is a technique where the system recommends items based on the preferences of similar users. For instance, platforms like Netflix use collaborative filtering to suggest movies or shows to users.
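
The collaborative filtering idea can be sketched in a few lines of NumPy. This is a toy user-based variant under stated assumptions: the ratings matrix is invented, a 0 stands for "not rated", and the function names cosine_similarity and predict_rating are hypothetical. It is not a description of how any particular platform implements recommendations.

```python
import numpy as np

# Hypothetical ratings: rows are users, columns are items, 0 means "not rated"
ratings = np.array([
    [5, 4, 0, 1],   # user 0 has not rated item 2
    [4, 5, 5, 1],   # user 1 has similar tastes to user 0
    [1, 1, 2, 5],   # user 2 has different tastes
], dtype=float)

def cosine_similarity(a, b):
    """Cosine of the angle between two rating vectors."""
    return a @ b / (np.linalg.norm(a) * np.linalg.norm(b))

def predict_rating(ratings, user, item):
    """Similarity-weighted average of other users' ratings for the item."""
    num, den = 0.0, 0.0
    for other in range(len(ratings)):
        if other == user or ratings[other, item] == 0:
            continue  # skip the user themselves and users who did not rate the item
        sim = cosine_similarity(ratings[user], ratings[other])
        num += sim * ratings[other, item]
        den += abs(sim)
    return num / den if den else 0.0

predicted = predict_rating(ratings, user=0, item=2)
print(f"Predicted rating for user 0 on item 2: {predicted:.2f}")
```

Because user 1 is more similar to user 0 than user 2 is, the prediction lands closer to user 1's rating of 5 than to user 2's rating of 2.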

Example Code in Python for Linear Regression

Linear regression is a simple yet powerful technique in supervised learning. It's used to predict a continuous output variable based on one or more input features.

import numpy as np
from sklearn.linear_model import LinearRegression
import matplotlib.pyplot as plt

# Sample data
X = np.array([1, 2, 3, 4, 5]).reshape(-1, 1)
y = np.array([2, 4, 5, 4, 5])

# Create a linear regression model
model = LinearRegression()

# Train the model
model.fit(X, y)

# Make predictions
predictions = model.predict(X)

# Plot the data and the regression line
plt.scatter(X, y, label='Data')
plt.plot(X, predictions, color='red', label='Regression Line')
plt.xlabel('Input')
plt.ylabel('Output')
plt.legend()
plt.show()

In this example, the code uses the scikit-learn library to create a linear regression model, train it on sample data, and visualize the data points along with the regression line.
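
For a single input feature, the fitted line can also be computed by hand with the ordinary least squares formulas, which helps show what the model is learning. The sketch below reuses the same sample data. The variable names slope and intercept are illustrative, and the result should match what scikit-learn stores in model.coef_ and model.intercept_.

```python
import numpy as np

# Same sample data as in the scikit-learn example
x = np.array([1, 2, 3, 4, 5], dtype=float)
y = np.array([2, 4, 5, 4, 5], dtype=float)

# Ordinary least squares for one feature:
#   slope = sum((x - mean_x) * (y - mean_y)) / sum((x - mean_x)^2)
#   intercept = mean_y - slope * mean_x
slope = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
intercept = y.mean() - slope * x.mean()

print(f"y = {slope:.1f} * x + {intercept:.1f}")  # prints "y = 0.6 * x + 2.2"
```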

Future of Machine Learning

The future of machine learning is promising, with advances in deep learning, reinforcement learning, and interpretability. The integration of ML in various industries is expected to drive innovation and solve complex problems by harnessing the power of data-driven intelligence.