An extensive tutorial on hydrogen bonds for computer beginners, including Python exercises.
What is a Hydrogen Bond?
Imagine you have two magnets. They aren’t glued together, but they feel a strong pull toward each other. A hydrogen bond is a bit like that, but at the molecular level.
It happens when a hydrogen atom, which is already “married” (covalently bonded) to a very greedy atom like Oxygen (O) or Nitrogen (N), feels an attraction to another greedy atom nearby.
- The Donor: The atom the hydrogen is already bonded to (e.g., Nitrogen in an amino acid).
- The Acceptor: The nearby greedy atom that pulls on the hydrogen (e.g., Oxygen in another amino acid).
In proteins, these “invisible strings” are what hold the structure together, like the rungs of a ladder or the folds of a dress.
Hydrogen bonds are weak individually, but they are strong in numbers! Think of Velcro: one tiny hook does nothing, but thousands of them can hold a jacket closed.
Why do we care?
Without hydrogen bonds:
- DNA would fall apart (the two strands are held together by H-bonds).
- Proteins would just be long, useless strings instead of 3D machines.
- Water would be a gas at room temperature (H-bonds keep water molecules “sticky” enough to be liquid).
Python Exercise: Finding H-Bonds
If you are new to programming, Python is like writing a recipe for a computer. We are going to write a script that:
- Downloads a protein structure.
- Looks for Nitrogen (N) and Oxygen (O) atoms.
- Calculates the distance between them.
- Flags pairs that are close enough (roughly 2.5 to 3.5 Ångströms apart; our script uses a stricter upper limit of 3.2) as potential hydrogen bonds!
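Before touching a real protein, it may help to see the distance step on its own. The sketch below uses two made-up atom positions; the coordinates and the names `nitrogen`, `oxygen`, and `is_potential_hbond` are hypothetical and exist only for illustration. The cutoffs are the 2.5 to 3.2 Å window that the script later on this page uses.

```python
import math

# Hypothetical (x, y, z) positions in Ångströms, made up for illustration.
nitrogen = (0.0, 0.0, 0.0)
oxygen = (1.0, 2.0, 2.0)


def is_potential_hbond(p1, p2, low=2.5, high=3.2):
    """Return True if the two points fall inside the distance window."""
    return low <= math.dist(p1, p2) <= high


# Straight-line distance: sqrt(1^2 + 2^2 + 2^2) = sqrt(9) = 3.0
distance = math.dist(nitrogen, oxygen)
print(f"Distance: {distance:.2f} Å")
print("Potential H-bond?", is_potential_hbond(nitrogen, oxygen))
```

Try moving `oxygen` farther away, for example to `(4.0, 0.0, 0.0)`, and check that the answer flips to `False`.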
1. Setup
First, we need to install ProDy, a tool that helps Python understand protein files (PDB files). The commands below use uv to create a project folder and add ProDy to it.
uv init hbond-project
cd hbond-project
uv add prody

2. The Python Script
Create a file named find_hbonds.py and paste the following code. Don’t worry if it looks scary; we will explain it below!
from prody import parsePDB
import numpy as np

# 1. Download and parse a small protein (Ubiquitin - 1UBQ)
pdb = parsePDB('1ubq')

# 2. Select only Nitrogen (Donors) and Oxygen (Acceptors)
# In a real scenario, we'd check if H is attached, but distance is a good start!
donors = pdb.select('element N')
acceptors = pdb.select('element O')

print(f"Searching for H-bonds between {len(donors)} Nitrogens and {len(acceptors)} Oxygens...")

# 3. Define a simple distance function
def calculate_distance(p1, p2):
    return np.sqrt(np.sum((p1 - p2)**2))

count = 0
# 4. Loop through every donor and every acceptor
for d in donors:
    for a in acceptors:
        # Avoid checking an atom against itself (though N and O are different elements)
        if d.getIndex() == a.getIndex():
            continue

        dist = calculate_distance(d.getCoords(), a.getCoords())

        # 5. If distance is between 2.5 and 3.2 Angstroms, it's likely an H-bond
        if 2.5 <= dist <= 3.2:
            print(f"Match! {d.getResname()}{d.getResnum()} (N) --- {a.getResname()}{a.getResnum()} (O) | Distance: {dist:.2f} Å")
            count += 1

print(f"\nFound {count} potential Hydrogen Bonds!")

3. Understanding the Code
- parsePDB('1ubq'): This tells the computer to go to the internet, find the protein “1ubq”, and bring it into Python.
- pdb.select('element N'): We are filtering all the atoms in the file to only look at Nitrogens.
- The “Nested Loop” (for d in donors: for a in acceptors:): This is the computer version of “Check every Nitrogen against every Oxygen, one by one.”
- dist <= 3.2: Scientists have found that for a hydrogen bond to exist, the atoms usually need to be this close.
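The nested loop is easy to read but slow on large proteins. As an alternative sketch (not the tutorial's own method), NumPy can compute every donor-acceptor distance at once using broadcasting. The coordinates below are made up for illustration, and the names `donor_xyz`, `acceptor_xyz`, `mask`, and `pairs` are hypothetical.

```python
import numpy as np

# Made-up coordinates (in Ångströms) standing in for real atoms.
donor_xyz = np.array([[0.0, 0.0, 0.0],
                      [10.0, 0.0, 0.0]])
acceptor_xyz = np.array([[3.0, 0.0, 0.0],
                         [0.0, 5.0, 0.0],
                         [10.0, 2.8, 0.0]])

# Broadcasting builds every donor-acceptor difference in one step:
# shape (2, 1, 3) minus shape (1, 3, 3) gives shape (2, 3, 3).
diff = donor_xyz[:, None, :] - acceptor_xyz[None, :, :]
dist = np.sqrt((diff ** 2).sum(axis=-1))  # shape (2, 3)

# Boolean mask of the pairs inside the 2.5 to 3.2 Å window.
mask = (dist >= 2.5) & (dist <= 3.2)

# (donor index, acceptor index) for each pair that passed the filter.
pairs = [(int(i), int(j)) for i, j in np.argwhere(mask)]
print(pairs)
```

With a real structure, the arrays returned by `donors.getCoords()` and `acceptors.getCoords()` should be usable in place of the made-up ones, assuming both selections are non-empty.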
Challenge for You
- Change the Protein: Try changing '1ubq' to '1ace' (Acetylcholinesterase). Does it find more or fewer bonds?
- Adjust the Threshold: What happens if you change 3.2 to 4.0? (Hint: You’ll find many more “matches,” but some might be too far to be real bonds).
- Specific Residue: Can you modify the code to only find H-bonds for a specific residue, like resnum 10?
In professional biology, we use more complex rules (including the angle of the bond), but measuring distance is the first step every bioinformatician learns!
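To give a feel for the angle rule, here is a minimal sketch that measures the angle at the hydrogen between the donor and the acceptor. All coordinates are hypothetical, the helper `angle_deg` is made up for this example, and the 120 degree cutoff is an assumption; published criteria vary. Note that many crystal structures do not include hydrogen positions, so a real analysis usually has to add them first.

```python
import math


def angle_deg(donor, hydrogen, acceptor):
    """Angle at the hydrogen (donor-H...acceptor), in degrees."""
    v1 = [d - h for d, h in zip(donor, hydrogen)]
    v2 = [a - h for a, h in zip(acceptor, hydrogen)]
    dot = sum(x * y for x, y in zip(v1, v2))
    cos_theta = dot / (math.hypot(*v1) * math.hypot(*v2))
    # Clamp to [-1, 1] to guard against floating-point rounding.
    return math.degrees(math.acos(max(-1.0, min(1.0, cos_theta))))


# Hypothetical coordinates in Ångströms.
donor = (0.0, 0.0, 0.0)
hydrogen = (1.0, 0.0, 0.0)
straight_acceptor = (2.9, 0.0, 0.0)  # donor, H, and acceptor in a line
bent_acceptor = (1.0, 1.9, 0.0)      # acceptor off to the side

MIN_ANGLE = 120.0  # assumed cutoff for illustration only

for name, acc in [("straight", straight_acceptor), ("bent", bent_acceptor)]:
    theta = angle_deg(donor, hydrogen, acc)
    print(f"{name}: {theta:.0f} degrees, passes: {theta >= MIN_ANGLE}")
```

A nearly straight arrangement passes the check, while the bent one is rejected even though the acceptor sits at a similar distance from the hydrogen.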