You can implement a contrastive loss function for fine-tuning a text-generation model by encouraging embeddings of similar texts to stay close together and pushing embeddings of dissimilar texts apart, using cosine similarity as the similarity measure.
A code sketch of this idea is shown below.

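A minimal sketch in PyTorch might look like the following; the function name `contrastive_triplet_loss`, the default temperature of 0.1, and the assumption that the inputs are pooled hidden states of shape `(batch, dim)` are illustrative choices rather than a fixed recipe.

```python
import torch
import torch.nn.functional as F

def contrastive_triplet_loss(anchor: torch.Tensor,
                             positive: torch.Tensor,
                             negative: torch.Tensor,
                             temperature: float = 0.1) -> torch.Tensor:
    """Triplet-style contrastive loss based on cosine similarity.

    anchor, positive, negative: (batch, dim) embeddings, e.g. pooled
    hidden states taken from the text-generation model.
    """
    # L2-normalize so that a plain dot product equals cosine similarity.
    anchor = F.normalize(anchor, dim=-1)
    positive = F.normalize(positive, dim=-1)
    negative = F.normalize(negative, dim=-1)

    # Cosine similarities, divided by the temperature to sharpen the distribution.
    pos_sim = (anchor * positive).sum(dim=-1) / temperature
    neg_sim = (anchor * negative).sum(dim=-1) / temperature

    # 2-way classification per anchor: the positive (class 0) should score
    # higher than the negative, which pulls positives closer to the anchor
    # and pushes negatives away.
    logits = torch.stack([pos_sim, neg_sim], dim=1)
    labels = torch.zeros(anchor.size(0), dtype=torch.long, device=anchor.device)
    return F.cross_entropy(logits, labels)
```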
In the code above, the key points are:
- Triplet-based setup: anchor, positive, and negative embeddings
- Cosine similarity via the dot product of normalized vectors
- Temperature scaling for contrastive sharpening
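As a quick illustration of how the loss might be plugged into a fine-tuning step (the batch size, embedding dimension, and use of randomly generated embeddings are assumptions for the example):

```python
# Hypothetical step: embeddings would normally be pooled from the model's hidden states.
anchor = torch.randn(8, 768, requires_grad=True)  # e.g. mean-pooled prompt embeddings
positive = torch.randn(8, 768)                     # paraphrases / same-topic texts
negative = torch.randn(8, 768)                     # unrelated texts

loss = contrastive_triplet_loss(anchor, positive, negative, temperature=0.1)
loss.backward()  # gradients flow back into whatever produced the anchor embeddings
```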
In this way, the loss strengthens semantic distinctions between texts and improves the quality of the representations learned during fine-tuning of text-generation models.