79562031

Date: 2025-04-08 12:31:57
Score: 4.5
Natty: 4
Report link

Can someone please guide me on how to convert a PyTorch .ckpt model to a Hugging Face-supported format so that I can use it with pre-trained models?

The model I'm trying to convert was trained using PyTorch Lightning, and you can find it here:
🔗 hydroxai/pii_model_longtransfomer_version

I need to use this model with the following GitHub repository for testing:
🔗 HydroXai/pii-masker

I tried using Hugging Face Spaces to convert the model to the .safetensors format. However, the resulting model produces poor results and triggers several warnings.
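For reference, my understanding is that a Lightning .ckpt is just a pickled dictionary whose weights live under the "state_dict" key, usually with the LightningModule attribute name prefixed to every weight name, so the conversion I'm attempting boils down to extracting and re-keying that sub-dictionary. A minimal sketch of that step (the key names and the "model." prefix are made-up placeholders, not the real checkpoint contents, and plain lists stand in for tensors so the sketch is self-contained):

```python
# A PyTorch Lightning .ckpt is a pickled dict: the weights sit under
# "state_dict" next to training metadata such as "epoch".
# In practice this dict would come from
# torch.load("model.ckpt", map_location="cpu");
# here it is stubbed out with placeholder keys.
checkpoint = {
    "epoch": 3,
    "global_step": 1200,
    "state_dict": {
        "model.deberta.embeddings.word_embeddings.weight": [0.1, 0.2],
        "model.classifier.weight": [0.3],
    },
}

# Lightning prefixes each key with the attribute name the wrapped model
# had on the LightningModule (assumed "model." here); Hugging Face's
# from_pretrained() expects that prefix stripped.
state_dict = {
    key.removeprefix("model."): value
    for key, value in checkpoint["state_dict"].items()
}
```

With the real checkpoint, the resulting state_dict would then go to safetensors.torch.save_file(state_dict, "model.safetensors") or model.load_state_dict(...) on a matching transformers model; whether the prefix really is "model." for this particular checkpoint is something I haven't verified.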

These are the warnings I'm seeing:



Some weights of the model checkpoint at /content/pii-masker/pii-masker/output_model/deberta3base_1024 were not used when initializing DebertaV2ForTokenClassification: ['deberta.head.lstm.bias_hh_l0', 'deberta.head.lstm.bias_hh_l0_reverse', 'deberta.head.lstm.bias_ih_l0', 'deberta.head.lstm.bias_ih_l0_reverse', 'deberta.head.lstm.weight_hh_l0', 'deberta.head.lstm.weight_hh_l0_reverse', 'deberta.head.lstm.weight_ih_l0', 'deberta.head.lstm.weight_ih_l0_reverse', 'deberta.output.bias', 'deberta.output.weight', 'deberta.transformers_model.embeddings.LayerNorm.bias', 'deberta.transformers_model.embeddings.LayerNorm.weight', 'deberta.transformers_model.embeddings.token_type_embeddings.weight', 'deberta.transformers_model.embeddings.word_embeddings.weight', 'deberta.transformers_model.encoder.layer.0.attention.output.LayerNorm.bias', 'deberta.transformers_model.encoder.layer.0.attention.output.LayerNorm.weight', ... (the same attention.output, attention.self.{key,query,value}, attention.self.{key,query,value}_global, intermediate.dense, and output weight/bias entries repeat for every encoder layer 0–11; list truncated) ... 'deberta.encoder.layer.9.output.dense.bias', 'deberta.encoder.layer.9.output.dense.weight', 'deberta.encoder.rel_embeddings.weight']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.
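Looking at the names in that list, my guess is that the mismatch comes from the checkpoint's extra "transformers_model." nesting (presumably a wrapper module from the training code) plus a custom LSTM head that DebertaV2ForTokenClassification simply has no parameters for. A renaming pass along these lines is what I suspect is needed (the mapping is a guess on my part, not a verified fix; the three sample keys are copied from the warning above):

```python
# Sample keys copied from the "weights not used" warning.
raw_keys = [
    "deberta.transformers_model.embeddings.word_embeddings.weight",
    "deberta.transformers_model.encoder.layer.0.attention.self.query.weight",
    "deberta.head.lstm.weight_ih_l0",
]

rename, dropped = {}, []
for key in raw_keys:
    if ".head." in key:
        # Custom LSTM head: DebertaV2ForTokenClassification has no slot
        # for these, so they could only be used via a custom model class.
        dropped.append(key)
    else:
        # Strip the wrapper-module segment so the key matches the name
        # the Hugging Face model expects (my assumption, not verified).
        rename[key] = key.replace("deberta.transformers_model.", "deberta.")
```

If that renaming is right, the re-keyed weights could be loaded with load_state_dict(..., strict=False), but the dropped head weights mean the token-classification output would still need the original custom head to reproduce the trained model's results.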
Reasons:
  • Blacklisted phrase (1): guide me
  • Blacklisted phrase (0.5): I need
  • RegEx Blacklisted phrase (2.5): Can someone please guide me
  • Long answer (-1):
  • Has code block (-0.5):
  • Contains question mark (0.5):
  • Starts with a question (0.5): Can someone please
  • Low reputation (1):
Posted by: Mahnoor Syeda