Abhishek Srivastava’s research while affiliated with Indian Institute of Technology Delhi and other places


Publications (4)


[Figures: Architecture of conditional hybrid GAN; Training curves of self-BLEU scores on testing dataset; Sheet music for test lyrics samples; Training curves of MMD scores on testing dataset; Sheet music for a sample lyrics at different stages of training]

Conditional hybrid GAN for melody generation from lyrics
Article · October 2022 · 179 Reads · 15 Citations

Neural Computing and Applications

Yi Yu · [...] · Yi Ren

Conditional sequence generation aims to guide the generation procedure by conditioning the model on additional context information, which is an interesting research issue in AI and machine learning. Unfortunately, current state-of-the-art generative models for music fail to generate good melodies due to the discrete-valued property of music attributes. In this paper, we propose a novel conditional hybrid GAN (C-Hybrid-GAN) for melody generation from lyrics. Three discrete sequences corresponding to music attributes, namely pitch, duration, and rest, are separately generated by the melody generation model conditioned on the same lyrics. Gumbel-Softmax is used to approximate the distribution of discrete-valued samples so as to directly generate discrete melody attributes. Most importantly, a hybrid structure is proposed, which contains three independent branches (one per melody attribute) in the generator and one branch for distinguishing concatenated attributes in the discriminator. A relational memory core is exploited to model not only the dependency inside each attribute sequence during the training of the generator, but also the consistency among the three attribute sequences during the training of the discriminator. Through extensive experiments using evaluation metrics such as maximum mean discrepancy, average rest value, and MIDI number transition, we demonstrate that the proposed C-Hybrid-GAN outperforms the existing methods in melody generation from lyrics.
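The abstract above names maximum mean discrepancy (MMD) as one of its evaluation metrics. As a rough illustration of what that metric computes, here is a minimal sketch of the biased squared-MMD estimator with a Gaussian kernel. The function names, the kernel choice, the bandwidth `sigma`, and the toy pitch values are all assumptions made for this example; the paper's actual evaluation setup is not described on this page.

```python
import math


def gaussian_kernel(a, b, sigma=1.0):
    """Gaussian (RBF) kernel between two equal-length feature vectors."""
    squared_distance = sum((x - y) ** 2 for x, y in zip(a, b))
    return math.exp(-squared_distance / (2.0 * sigma ** 2))


def mmd_squared(xs, ys, sigma=1.0):
    """Biased estimate of squared MMD between two sample sets.

    xs and ys are lists of feature vectors (hypothetical melody features).
    """
    def mean_kernel(p, q):
        total = sum(gaussian_kernel(a, b, sigma) for a in p for b in q)
        return total / (len(p) * len(q))

    return mean_kernel(xs, xs) + mean_kernel(ys, ys) - 2.0 * mean_kernel(xs, ys)


# Toy one-dimensional "pitch" samples (illustrative values only).
reference = [[60.0], [62.0], [64.0]]
shifted = [[80.0], [82.0], [84.0]]

identical_score = mmd_squared(reference, reference)
shifted_score = mmd_squared(reference, shifted)
```

A value near zero suggests the two sample sets are hard to tell apart under the chosen kernel, while larger values indicate a bigger distributional gap.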


Melody Generation from Lyrics Using Three Branch Conditional LSTM-GAN

March 2022 · 39 Reads · 10 Citations

Lecture Notes in Computer Science

With the availability of a paired lyrics-melody dataset and advancements in artificial intelligence techniques, research on melody generation conditioned on lyrics has become possible. In this work, for melody generation, we propose a novel architecture, Three Branch Conditional (TBC) LSTM-GAN conditioned on lyrics, which is composed of an LSTM-based generator and an LSTM-based discriminator. The generative model is composed of three branches of identical and independent lyrics-conditioned LSTM-based sub-networks, each responsible for generating an attribute of a melody. For discrete-valued sequence generation, we leverage the Gumbel-Softmax technique to train GANs. Through extensive experiments, we show that our proposed model generates tuneful and plausible melodies from the given lyrics and outperforms the current state-of-the-art models quantitatively as well as qualitatively.

Keywords: Melody generation from lyrics, LSTM, GAN


Conditional LSTM-GAN for Melody Generation from Lyrics

April 2021 · 54 Reads · 158 Citations

ACM Transactions on Multimedia Computing, Communications and Applications

Melody generation from lyrics has been a challenging research issue in the field of artificial intelligence and music, which enables us to learn and discover latent relationships between interesting lyrics and accompanying melodies. Unfortunately, the limited availability of a paired lyrics–melody dataset with alignment information has hindered the research progress. To address this problem, we create a large dataset consisting of 12,197 MIDI songs each with paired lyrics and melody alignment through leveraging different music sources where alignment relationship between syllables and music attributes is extracted. Most importantly, we propose a novel deep generative model, conditional Long Short-Term Memory (LSTM)–Generative Adversarial Network for melody generation from lyrics, which contains a deep LSTM generator and a deep LSTM discriminator both conditioned on lyrics. In particular, lyrics-conditioned melody and alignment relationship between syllables of given lyrics and notes of predicted melody are generated simultaneously. Extensive experimental results have proved the effectiveness of our proposed lyrics-to-melody generative model, where plausible and tuneful sequences can be inferred from lyrics.


Conditional Hybrid GAN for Sequence Generation

September 2020 · 68 Reads

Conditional sequence generation aims to guide the generation procedure by conditioning the model on additional context information, which is a self-supervised learning issue (a form of unsupervised learning with supervision information from the data itself). Unfortunately, the current state-of-the-art generative models have limitations in sequence generation with multiple attributes. In this paper, we propose a novel conditional hybrid GAN (C-Hybrid-GAN) to solve this issue. Discrete sequences with triplet attributes are generated separately when conditioned on the same context. Most importantly, a relational reasoning technique is exploited to model not only the dependency inside each attribute sequence during the training of the generator but also the consistency among the attribute sequences during the training of the discriminator. To avoid the non-differentiability problem that GANs encounter during discrete data generation, we exploit the Gumbel-Softmax technique to approximate the distribution of discrete-valued sequences. By evaluating the task of generating melody (associated with note, duration, and rest) from lyrics, we demonstrate that the proposed C-Hybrid-GAN outperforms the existing methods in context-conditioned discrete-valued sequence generation.
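The Gumbel-Softmax technique mentioned in this abstract replaces a hard, non-differentiable categorical sample with a temperature-controlled softmax over noise-perturbed logits. The following is a minimal standalone sketch of that sampling step only, assuming plain Python lists for logits. The function name, the temperature parameter `tau`, and the example logits are illustrative assumptions and are not taken from the authors' implementation.

```python
import math
import random


def gumbel_softmax(logits, tau=1.0, rng=None):
    """Return a relaxed one-hot sample over the categories in `logits`.

    Lower `tau` pushes the output toward a one-hot vector; higher `tau`
    makes it smoother.
    """
    rng = rng or random.Random()
    perturbed = []
    for logit in logits:
        # Clamp the uniform draw away from 0 and 1 so both logs are finite.
        u = min(max(rng.random(), 1e-12), 1.0 - 1e-12)
        gumbel_noise = -math.log(-math.log(u))
        perturbed.append((logit + gumbel_noise) / tau)

    # Numerically stable softmax over the perturbed, temperature-scaled logits.
    peak = max(perturbed)
    exps = [math.exp(value - peak) for value in perturbed]
    total = sum(exps)
    return [value / total for value in exps]


# Hypothetical logits over three categories (e.g. three candidate durations).
rng = random.Random(0)
soft_sample = gumbel_softmax([2.0, 0.5, -1.0], tau=0.5, rng=rng)

# With a dominant logit and a low temperature the sample is close to one-hot.
sharp_sample = gumbel_softmax([100.0, 0.0, 0.0], tau=0.01, rng=rng)
```

In a GAN setting, an operation like this would be expressed with a differentiable tensor library so that gradients can flow from the discriminator back to the generator; this pure-Python version only illustrates the arithmetic.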

Citations (3)


... Research utilizing a single structure independently also investigates the potential of LSTMs and GANs. Yu et al. (2022) [35] also proposed a three-branch structure for modeling three independent melodic attributes. The difference is that they did not use LSTM but used a conditional hybrid GAN. ...

Reference:

AI-Enabled Text-to-Music Generation: A Comprehensive Review of Methods, Frameworks, and Future Directions
Conditional hybrid GAN for melody generation from lyrics

Neural Computing and Applications

... In (Yu et al., 2021), a syllable-level lyrics-melody paired dataset was proposed with an LSTM-GAN model addressing the lyrics-conditioned melody generation problem. Some following works also explored lyrics-to-melody generation problems based on this dataset (Yu et al., 2020; Srivastava et al., 2022; Duan et al., , 2023b). However, melody-to-lyrics generation on syllable level is a more difficult task in predicting semantic dependencies among syllable-level, word-level, and sentence-level meaning. ...

Melody Generation from Lyrics Using Three Branch Conditional LSTM-GAN
  • Citing Chapter
  • March 2022

Lecture Notes in Computer Science

... This way, they provide a way to augment the human composer rather than just replace them. In the literature, we find models that allow the generated MIDI files to be controlled based on an input emotion (Makris, Agres, and Herremans 2021), chords (Zixun, Makris, and Herremans 2021), lyrics (Yu, Srivastava, and Canales 2021), tension (Guo et al. 2020), among others. Typically, these models work by feeding an input label with the desired condition, which is processed through cross attention in transformer networks, or other types of conditional neural networks. ...

Conditional LSTM-GAN for Melody Generation from Lyrics
  • Citing Article
  • April 2021

ACM Transactions on Multimedia Computing, Communications and Applications