Machine Learning Diaries
Transformer Layers as Painters
Understanding how a Transformer's layers actually function
Mar 30
•
Priyanka Nath
Introducing Model Soups - How to Increase Accuracy of Fine-Tuned LLMs Without Increasing Inference Time?
A brief summary of the Model Soups paper: averaging the weights of multiple fine-tuned models improves accuracy without increasing inference time
Mar 13
•
Priyanka Nath
November 2024
Learn from Experiences of Experts - Running Trustworthy A/B Test
The definitive practical guide to A/B tests, summarizing the experience of experts
Nov 27, 2024
•
Priyanka Nath
Super Weights in LLMs - How Pruning Them Destroys an LLM's Ability to Generate Text?
Super weights are crucial to the performance of LLMs and can have an outsized impact on a model's behaviour
Nov 18, 2024
•
Priyanka Nath
Introducing Etalon: How we choose an LLM with optimal Runtime Performance?
How to evaluate LLMs and identify the best LLM inference system
Nov 11, 2024
•
Priyanka Nath
September 2024
Chain-of-Thought Prompting Elicits Reasoning in Large Language Models
Improving the performance of LLMs using a series of prompts that mimic how humans solve complex problems
Sep 9, 2024
•
Priyanka Nath
February 2024
How to apply Boosting when the Data Labels are Noisy and Uncertain?
LocalBoost - Local Boosting for Weakly-Supervised Learning
Feb 28, 2024
•
Priyanka Nath
January 2024
Everything you need to know about identifying hallucinations by LLMs
“Why is this misleading?”: Detecting News Headline Hallucinations with Explanations
Jan 2, 2024
•
Priyanka Nath
September 2023
DiffGrad: Is it the right optimization method for training your CNNs?
Learn about DiffGrad, an optimizer that addresses the overshooting problem of Adam
Sep 25, 2023
•
Priyanka Nath
January 2023
What is Ant Colony Optimization and how does it help?
A deep dive into the mechanics and applications of Ant Colony Optimization
Jan 26, 2023
•
Sheryl Bellary
December 2022
A Look Into The Emerging Domain of Metric Learning
All about metric learning and its impact on the world of computer vision
Dec 21, 2022
•
Joshua Raj
OptFormer: Google's Improved Hyperparameter Optimization Technique
Optimizers with Transformers
Dec 16, 2022
•
Sheryl Bellary