# GPepT: A Foundation Language Model for Peptidomimetics Incorporating Noncanonical Amino Acids

https://mdr.nims.go.jp/datasets/9d81f75d-cc04-410f-ae94-4f7edd013889

## Download

- [oikawa-et-al-2025-gpept-a-foundation-language-model-for-peptidomimetics-incorporating-noncanonical-amino-acids.pdf](https://mdr.nims.go.jp/filesets/4d4fc3cf-62ba-4bdb-8f9d-312509beb24d/download)
- [ml5c00375_si_002.pdf](https://mdr.nims.go.jp/filesets/113a3bf7-4f62-43c1-840a-fcb08be15b5f/download)

## Id

9d81f75d-cc04-410f-ae94-4f7edd013889

## Visibility

open_to_public

## State

published

## Created at

2025-08-24T23:44:10.397703Z

## Updated at

2025-08-25T03:30:37.601483Z

## Published at

2025-08-25T03:19:24.601583Z

## First published url

https://doi.org/10.1021/acsmedchemlett.5c00375

## Date published

2025-07-22

## Resource type

journal_article

## Manuscript type

vor

## Title

- title: 'GPepT: A Foundation Language Model for Peptidomimetics Incorporating Noncanonical
    Amino Acids'
  title_type: original
  lang: en

## Description

- description: 'Language models have become increasingly popular in therapeutic peptide
    generation, but molecular diversity remains limited due to reliance on the 20
    canonical amino acids. We propose a language model that generates peptidomimetics
    incorporating noncanonical elements such as noncanonical amino acids and terminal
    modifications. To accomplish this, we created a vocabulary of over 17,000 noncanonical
    elements by extracting them from chemical formulas stored in the ChEMBL database.
    Our pretrained language model, GPepT, showed improved diversity in molecular structures
    and chemical properties. To demonstrate its real-world application, we fine-tuned
    the model for antimicrobial peptides. Experimental validation revealed that one
    of the generated peptidomimetics exhibited effective antimicrobial activity, marking
    a successful case of AI-driven peptide development. GPepT is fully accessible
    on HuggingFace: https://huggingface.co/Playingyoyo/GPepT.'
  description_type: abstract
  lang: en

## Creator

- name: Yuna Oikawa
  role: author
- name: Takanori Uzawa
  role: author
  orcid: https://orcid.org/0000-0001-6042-513X
- name: Francois Berenger
  role: author
- name: Noriko Minagawa
  role: author
- name: Akiko Yumoto
  role: author
- name: Hideaki Takaku
  role: author
- name: Ryo Tamura
  role: author
  orcid: https://orcid.org/0000-0002-0349-358X
- name: Yoshihiro Ito
  role: author
  orcid: https://orcid.org/0000-0002-1154-253X
- name: Koji Tsuda
  role: author
  orcid: https://orcid.org/0000-0002-4288-1606

## Publisher

organization: American Chemical Society (ACS)

## Keyword

- subject: Language model
  schema: not_defined
- subject: amino acid
  schema: not_defined

## Rights

- identifier: https://creativecommons.org/licenses/by-nc-nd/4.0/

## Data origin

- data_origin_type: other

## Journal

- title: ACS Medicinal Chemistry Letters
  issn: '19485875'
  volume: '16'
  issue: '8'
  article_number: acsmedchemlett.5c00375

## Funding

- identifier: JPMJCR21O2
  funder_name: Core Research for Evolutional Science and Technology
- identifier: JPMJER1903
  funder_name: Exploratory Research for Advanced Technology
- identifier: JPMXP1122712807
  funder_name: Agency for Cultural Affairs, Government of Japan

## Fileset

- id: 4d4fc3cf-62ba-4bdb-8f9d-312509beb24d
  filename: oikawa-et-al-2025-gpept-a-foundation-language-model-for-peptidomimetics-incorporating-noncanonical-amino-acids.pdf
  content_type: application/pdf
  size: 3143262
  md5: 8c9010c3c40927ce4017035729a2670f
- id: 113a3bf7-4f62-43c1-840a-fcb08be15b5f
  filename: ml5c00375_si_002.pdf
  content_type: application/pdf
  size: 861600
  md5: 4b8b5c1d675e6f6ab6597fec8f14b565
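
The `size` and `md5` values listed for each file can be used to check a downloaded copy. The following is a minimal sketch, not part of the record itself: the helper names `md5_of` and `matches_record` are hypothetical, and it assumes the two PDFs were saved locally under the filenames shown in the Fileset entries.

```python
import hashlib
from pathlib import Path

# Expected (size in bytes, md5 hex digest), copied from the Fileset entries.
EXPECTED = {
    "oikawa-et-al-2025-gpept-a-foundation-language-model-for-peptidomimetics-incorporating-noncanonical-amino-acids.pdf": (
        3143262,
        "8c9010c3c40927ce4017035729a2670f",
    ),
    "ml5c00375_si_002.pdf": (861600, "4b8b5c1d675e6f6ab6597fec8f14b565"),
}


def md5_of(path, chunk_size=1 << 20):
    """Return the MD5 hex digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_record(path, size, md5):
    """True if the file at `path` has the recorded size and MD5 digest."""
    p = Path(path)
    return p.stat().st_size == size and md5_of(p) == md5.lower()


if __name__ == "__main__":
    # Assumes the files sit in the current working directory (an assumption).
    for name, (size, md5) in EXPECTED.items():
        if Path(name).exists():
            status = "OK" if matches_record(name, size, md5) else "MISMATCH"
        else:
            status = "not found"
        print(f"{name}: {status}")
```

Checking the size first is cheap and catches truncated downloads before the file is hashed. MD5 is suitable here only as an integrity check against accidental corruption, not as protection against tampering.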

## Thumbnail

fileset_id: 4d4fc3cf-62ba-4bdb-8f9d-312509beb24d
filename: oikawa-et-al-2025-gpept-a-foundation-language-model-for-peptidomimetics-incorporating-noncanonical-amino-acids.pdf