An extensive tutorial on the hydrophobic effect and interactions for computer beginners, including Python exercises with ProDy.
What is the Hydrophobic Effect?
If you’ve ever tried to mix oil and water to make a salad dressing, you’ve seen the hydrophobic effect in action. No matter how much you shake the bottle, the oil eventually clumps back together, separating itself from the water.
In Greek, Hydro means “water” and Phobos means “fear.” So, “hydrophobic” literally means “water-fearing.”
In a protein:
- Hydrophilic (water-loving) parts of the protein like to be on the outside, touching the water.
- Hydrophobic (water-fearing) parts like to hide on the inside, away from the water.
This is a major “driving force” behind protein folding: it helps a long, stringy chain collapse into a compact, functional shape.

The “Crowded Party” Analogy
Imagine a room full of people who all love to talk to each other (water molecules). They are constantly shaking hands and chatting (forming hydrogen bonds).
Now, imagine a few people walk in who don’t speak the same language and don’t want to talk to anyone (hydrophobic molecules).
- To keep the conversation flowing, the “talkers” will naturally push the “non-talkers” into the center of the room or into a corner so they don’t get in the way of the handshakes.
- The “non-talkers” aren’t strongly attracted to each other; they are mostly pushed together by the water molecules that want to hang out with each other!
The hydrophobic effect is unusual because it is not mainly about a “pull” between atoms. It is about entropy: the water molecules “prefer” the hydrophobic parts to stay out of the way so that the water can remain freer and more disordered.
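One way to picture why clustering helps is to count how much hydrophobic surface touches water. The sketch below is a toy model, not a simulation: each “non-talker” is a unit cube on a grid, and the function name `exposed_faces` and the cube positions are made up for illustration. It compares eight cubes scattered apart with the same eight cubes packed into a 2×2×2 block.

```python
from itertools import product

# The six directions in which a unit cube can have a face-sharing neighbor.
NEIGHBOR_OFFSETS = [(1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, -1, 0), (0, 0, 1), (0, 0, -1)]

def exposed_faces(cells):
    """Count cube faces that have no neighboring cube (i.e. faces touching water)."""
    cells = set(cells)
    return sum(
        (x + dx, y + dy, z + dz) not in cells
        for (x, y, z) in cells
        for (dx, dy, dz) in NEIGHBOR_OFFSETS
    )

# Eight hydrophobic cubes spread out so that none of them touch.
scattered = [(3 * i, 0, 0) for i in range(8)]

# The same eight cubes packed into a 2x2x2 block.
clustered = list(product(range(2), repeat=3))

scattered_faces = exposed_faces(scattered)
clustered_faces = exposed_faces(clustered)

print(f"Scattered: {scattered_faces} faces touch water")  # 48
print(f"Clustered: {clustered_faces} faces touch water")  # 24
```

Packing the cubes together halves the surface that water has to work around, which is the geometric idea behind the “non-talkers” being pushed into a corner.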
Water has unique properties that enable it to serve as the universal solvent.
Specifically, water molecules possess an electric dipole, in which the oxygen atom carries a partial negative charge, whereas both hydrogen atoms carry a partial positive charge.
As a result, individual water molecules tend to hydrogen-bond with each other in a way that connects their dipoles into one large network.
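This property is manifested in the high surface tension and boiling temperature of bulk water. Although the water molecules are interconnected, bulk water is dynamic; the individual molecules tend to detach from and reattach to the network rapidly. This increases the inherent disorder of bulk water, or in other words, its entropy. A hydrophobic molecule cannot join this network, so the water molecules next to it are forced into a more ordered arrangement around it, which lowers their entropy. When hydrophobic groups cluster together, less water is held in this ordered state.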
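The role of entropy can be summarized with the standard Gibbs free energy relation from thermodynamics. The text above does not state it, so treat this as a supplementary sketch of the usual textbook picture near room temperature:

```latex
\[
\Delta G = \Delta H - T\,\Delta S
\]
```

Here $\Delta G$ is the change in free energy, $\Delta H$ the change in enthalpy, $T$ the absolute temperature, and $\Delta S$ the change in entropy. A process is favorable when $\Delta G$ is negative. When hydrophobic groups cluster, water that was ordered around them is released into the bulk, so $\Delta S$ is positive and the $-T\,\Delta S$ term lowers $\Delta G$, even if $\Delta H$ changes little.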
Python Exercise: Finding the Hydrophobic Core
In this exercise, we will use ProDy to identify the hydrophobic residues in a protein and see if they are actually “hiding” in the center (the core).
1. Setup
If you haven’t already, install the necessary tools:
    uv init hydrophobic-project
    cd hydrophobic-project
    uv add prody numpy

2. The Python Script
Create a file named find_hydrophobic.py. We will look at Ubiquitin (1UBQ) and identify common hydrophobic amino acids: Isoleucine (ILE), Leucine (LEU), Valine (VAL), Phenylalanine (PHE), and Methionine (MET).
    from prody import parsePDB
    import numpy as np

    # 1. Download and parse Ubiquitin
    pdb = parsePDB('1ubq')

    # 2. Define hydrophobic residues
    # These are the "non-talkers" who hide from water.
    hydrophobic_query = 'resname ILE LEU VAL PHE MET'
    hydrophobic_atoms = pdb.select(hydrophobic_query)

    print(f"Found {len(hydrophobic_atoms)} atoms in hydrophobic residues.")

    # 3. Calculate the Center of the Protein
    # We can find the average position (centroid) of the protein atoms.
    # Selecting 'protein' keeps any water molecules in the file out of the average.
    all_coords = pdb.select('protein').getCoords()
    protein_center = np.mean(all_coords, axis=0)
    print(f"Protein Center (Geometric): {protein_center}")

    # 4. Check distance from center for each hydrophobic residue
    # We'll look at Alpha Carbons (CA) to represent the residue position.
    hydrophobic_ca = pdb.select(f'({hydrophobic_query}) and name CA')

    distances = []
    for atom in hydrophobic_ca:
        dist = np.linalg.norm(atom.getCoords() - protein_center)
        distances.append(dist)
        # print(f"Residue {atom.getResname()}{atom.getResnum()} is {dist:.2f} Å from center")

    avg_dist_hydro = np.mean(distances)

    # 5. Compare with Hydrophilic residues (like Lysine or Glutamate)
    hydrophilic_ca = pdb.select('resname LYS GLU ASP ARG and name CA')
    distances_philic = []
    for atom in hydrophilic_ca:
        dist = np.linalg.norm(atom.getCoords() - protein_center)
        distances_philic.append(dist)

    avg_dist_philic = np.mean(distances_philic)

    print("\n--- Results ---")
    print(f"Average distance of Hydrophobic residues from center: {avg_dist_hydro:.2f} Å")
    print(f"Average distance of Hydrophilic residues from center: {avg_dist_philic:.2f} Å")

    if avg_dist_hydro < avg_dist_philic:
        print("\nSuccess! The hydrophobic residues are closer to the center on average.")
    else:
        print("\nInteresting! In this small protein, the separation might be less obvious.")

3. Understanding the Code
- resname ILE LEU VAL PHE MET: These are the codes for some of the most hydrophobic amino acids.
- np.mean(all_coords, axis=0): This finds the “middle” of the protein by averaging the X, Y, and Z coordinates of the atoms.
- np.linalg.norm(...): This is a fancy way of saying “calculate the straight-line distance between two points.”
- The Comparison: We expect hydrophobic residues to have a smaller average distance from the center because they are “buried” inside.
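To see what these two NumPy calls do without downloading a structure, here is a minimal sketch on three made-up points. The coordinates are illustrative only and do not come from 1UBQ. It also shows that a per-atom loop can be replaced by a single call, using the axis argument of np.linalg.norm.

```python
import numpy as np

# Three made-up atom positions (x, y, z) in Å; illustrative only.
coords = np.array([
    [0.0, 0.0, 0.0],
    [3.0, 0.0, 0.0],
    [0.0, 6.0, 0.0],
])

# axis=0 averages down each column, giving a single (x, y, z) point.
center = coords.mean(axis=0)

# Subtracting the center from every row, then taking the norm along axis=1,
# gives the straight-line distance of each point from the center.
distances = np.linalg.norm(coords - center, axis=1)

print("Center:", center)                  # [1. 2. 0.]
print("Distances:", distances.round(2))   # [2.24 2.83 4.12]
```

With ProDy, the same pattern should work on an array returned by a selection's getCoords(), which has one row per atom.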
Challenge for You
- Add more residues: Research which other amino acids are hydrophobic (like Tryptophan - TRP or Alanine - ALA) and add them to the hydrophobic_query.
- Visualization: If you have PyMOL installed, try this command: show spheres, resn ILE+LEU+VAL+PHE+MET. You will see them forming a “cluster” in the middle!
- Find “Surface” Hydrophobics: Sometimes hydrophobic residues are on the outside (often to help two proteins stick together). Can you find the residue that is the farthest from the center in your list?
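As a hint for the last challenge, the sketch below shows one possible way to pick out the point farthest from a center using np.argmax. The helper name `farthest_from_center`, the labels, and the coordinates are all hypothetical; they are toy data, not results from 1UBQ.

```python
import numpy as np

def farthest_from_center(labels, ca_coords, center):
    """Return (label, distance) for the point farthest from the given center."""
    ca_coords = np.asarray(ca_coords, dtype=float)
    center = np.asarray(center, dtype=float)
    # One distance per row (one row per residue).
    distances = np.linalg.norm(ca_coords - center, axis=1)
    idx = int(np.argmax(distances))  # index of the largest distance
    return labels[idx], float(distances[idx])

# Toy data: hypothetical residue labels and made-up CA positions in Å.
labels = ["ILE-a", "LEU-b", "VAL-c"]
ca_coords = [[1.0, 0.0, 0.0], [0.0, 5.0, 0.0], [0.0, 0.0, 2.0]]

label, dist = farthest_from_center(labels, ca_coords, center=[0.0, 0.0, 0.0])
print(f"{label} is farthest from the center at {dist:.2f} Å")
```

In the real script, the labels could be built from atom.getResname() and atom.getResnum() for each atom in hydrophobic_ca, with hydrophobic_ca.getCoords() as the coordinates and protein_center as the center.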