Bhashini - IndicNER

IndicNER is a multilingual Named Entity Recognition model fine-tuned on 11 Indian languages to identify named entities in text

About Model

IndicNER is a multilingual Named Entity Recognition (NER) model developed by Bhashini. It recognizes and classifies named entities, such as persons, organizations, locations, and dates, in text written in 11 Indian languages:
Hindi, Bengali, Tamil, Telugu, Gujarati, Punjabi, Marathi, Assamese, Kannada, Malayalam and Oriya.
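
Token-classification NER models usually emit one label per token, and adjacent tokens are then merged into entity spans. The sketch below illustrates that merging step under an assumption this page does not confirm: that labels follow a BIO scheme such as B-PER, I-PER, B-LOC and O. The function name `group_bio` and the sample labels are hypothetical and chosen only for illustration.

```python
from typing import List, Optional, Tuple


def group_bio(tokens: List[str], labels: List[str]) -> List[Tuple[str, str]]:
    """Merge per-token BIO labels into (entity_text, entity_type) pairs.

    Assumes a BIO label scheme (B-XXX, I-XXX, O); this is an assumption,
    not something stated on the model page.
    """
    entities: List[Tuple[str, str]] = []
    current_tokens: List[str] = []
    current_type: Optional[str] = None

    def flush() -> None:
        # Store the entity collected so far, if there is one.
        if current_tokens and current_type is not None:
            entities.append((" ".join(current_tokens), current_type))

    for token, label in zip(tokens, labels):
        starts_new = label.startswith("B-") or (
            label.startswith("I-") and label[2:] != current_type
        )
        if starts_new:
            # A B- tag, or an I- tag that does not continue the open entity,
            # begins a new entity.
            flush()
            current_tokens, current_type = [token], label[2:]
        elif label.startswith("I-"):
            # Continuation of the entity that is currently open.
            current_tokens.append(token)
        else:
            # An O tag closes any open entity.
            flush()
            current_tokens, current_type = [], None
    flush()
    return entities


# Hand-written labels for a Hindi sentence (not real model output).
tokens = ["लता", "मंगेशकर", "का", "जन्म", "इंदौर", "में", "हुआ"]
labels = ["B-PER", "I-PER", "O", "O", "B-LOC", "O", "O"]
print(group_bio(tokens, labels))
# [('लता मंगेशकर', 'PER'), ('इंदौर', 'LOC')]
```

Treating a stray I- tag as the start of a new entity is a lenient design choice; a stricter decoder could discard such tokens instead.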

Training Dataset:
The model is fine-tuned on a large corpus derived from publicly available Indian NER datasets and human-annotated test sets, with the aim of consistent accuracy across languages. It has also been trained on data sourced from the Samanantar corpus, a large publicly available parallel corpus for Indian languages, to improve its contextual understanding. The base model used for fine-tuning is BERT-base-multilingual-uncased, a multilingual BERT checkpoint.

Use Cases:
IndicNER can be used for a wide range of Natural Language Processing (NLP) applications, including:

1. Automated document processing – Extracting key entities from government, legal, and business documents.
2. Chatbots and virtual assistants – Enhancing conversational AI by identifying user queries related to people, places, and organizations.
3. News and content analysis – Automatically tagging and categorizing entities in multilingual news articles.
4. Healthcare and medical records – Identifying patient details and medical terms for structured data extraction.

For more details and implementation, visit: https://huggingface.co/ai4bharat/IndicNER.
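
As a starting point for the use cases above, the following is a minimal sketch of how the model could be called through the Hugging Face `transformers` token-classification pipeline. It assumes that `transformers` and PyTorch are installed and that the `ai4bharat/IndicNER` repository from the link above is accessible from your environment (access may require authentication). The helper names `load_ner_pipeline` and `extract_entities` and the `min_score` threshold are hypothetical and not part of the model's own documentation.

```python
MODEL_ID = "ai4bharat/IndicNER"  # repository named in the link above


def load_ner_pipeline(model_id: str = MODEL_ID):
    """Build a token-classification pipeline for the given model id."""
    # Imported lazily so this module can be imported without transformers.
    from transformers import pipeline

    # aggregation_strategy="simple" merges sub-word pieces into whole entities.
    return pipeline(
        "token-classification",
        model=model_id,
        aggregation_strategy="simple",
    )


def extract_entities(ner, text: str, min_score: float = 0.5):
    """Run `ner` on `text` and keep (word, type, score) above `min_score`.

    `ner` is any callable returning dicts with "word", "entity_group" and
    "score" keys, which is the shape produced by an aggregated pipeline.
    The 0.5 default is an illustrative threshold, not a recommended value.
    """
    return [
        (r["word"], r["entity_group"], float(r["score"]))
        for r in ner(text)
        if r["score"] >= min_score
    ]


if __name__ == "__main__":
    # Downloads the model weights on first use.
    ner = load_ner_pipeline()
    for word, entity_type, score in extract_entities(ner, "लता मंगेशकर का जन्म इंदौर में हुआ"):
        print(f"{word}\t{entity_type}\t{score:.2f}")
```

Keeping the pipeline construction separate from the filtering step lets the filtering logic be exercised with a stub callable, without downloading the model.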



Metadata

  • MIT
  • AI4Bharat
  • Named Entity Recognition (NER) Model
  • Other
  • Open
  • Sector Agnostic
  • 05/03/25 15:23:12
  • Admin
  • 591.28 MB

Activity Overview

  • Upvotes 0
  • Downloads 9
  • Views 435
  • File Size 591.12 MB

Tags

  • Multilingual
  • Foreigners
  • NLP
  • Transformer
  • Token Classification
  • Pytorch
  • Samanantar
  • Bert
  • NER

License Control

MIT

Version Control

Version 2 (591.12 MB)
  • admin · 1 year ago

More Models from Daffodil Softwares Pvt.

Bhashini-AI4Bharat Textual Language Detection v1.0
Detects the language of the provided text. Currently supports 23 languages (English, Bangla, Manipuri, Bodo, Konkani, Oriya, Nepali, Marathi, Sindhi, Sanskrit, Malayalam, Urdu, Assamese, Telugu, Dogri, Gujarati, Kashmiri, Punjabi, Santali, Maithili, Hindi, Tamil, Kannada).
NLP
Multilingual
AI4Bharat
Text data
Text Language Detection
Transformer
Deep Learning
Text Processing
Bhashini
  • Upvotes 0
  • Downloads 72
  • File Size 3 MB
  • Views 857
Updated 11 months ago

DIGITAL INDIA BHASHINI DIVISION

Indic Trans2
AI4Bharat's Indic-Trans-v2 is a multilingual Transformer (~1.1B parameters) NMT model trained on the Samanantar v2 dataset, the largest publicly available parallel corpora collection for languages of India at the time of writing (23 March 2023). Two models are currently released, Indic to English and English to Indic, and together they support all 22 scheduled languages of India.
Machine Translation
Language Modeling
Bilingual Translation
Multilingual Translation
Regional Languages
Computational Linguistics
NLP
Indic-TransV2
Indian Languages
  • Upvotes 0
  • Downloads 16
  • File Size 214.60 KB
  • Views 318
Updated 1 year ago

DIGITAL INDIA BHASHINI DIVISION

Indic-Conformer model for ASR
Indo-Aryan Indic-Conformer is a multilingual speech model for North Indian languages. It is based on the Conformer large architecture and has 115M parameters.
Speech Processing
Bhashini
Automatic Speech Recognition
Speech Technology
Speech Lab
  • Upvotes 0
  • Downloads 13
  • File Size 64.91 KB
  • Views 430
Updated 1 year ago

DIGITAL INDIA BHASHINI DIVISION

IndicXlit
A Transformer-based multilingual transliteration model.
Regional Languages
Indian Languages
NLP
transliteration
Language Modeling
Multilingual Translation
Machine Translation
  • Upvotes 0
  • Downloads 6
  • File Size 3.94 MB
  • Views 253
Updated 1 year ago

DIGITAL INDIA BHASHINI DIVISION

Bhashini - Fastspeech2 Model using (HS)
Text-to-speech models trained using FastPitch and HiFi-GAN vocoder, separately for each language. Supports both 'female' and 'male' voices.
Transformer
Text to Speech
Text Processing
NLP
Multilingual
Language Detection
  • Upvotes 0
  • Downloads 10
  • File Size 286.72 MB
  • Views 377
Updated 1 year ago

DIGITAL INDIA BHASHINI DIVISION
