On Euler’s Phi Function

In which we find that Euler’s phi function was neither phi nor a function.

First of all, a shout-out to all of my math(s) friends who are at (or traveling to) the Joint Mathematics Meetings in Baltimore! Now on to some math.

In my research for the “Evolution of…” series of posts, I came across the word totient in Steven Schwartzman’s The Words of Mathematics, which got me thinking about how Euler’s φ (phi) function, also called the “totient function,” came about. The word itself isn’t that mysterious: totient comes from the Latin word tot, meaning “so many.” In a way, it’s the answer to the question Quot? (“how many”?). Schwartzman notes that the Quo/To pairing is similar to the Wh/Th pairing in English (Where? There. What? That. When? Then.). So much for the etymology.

It seems to me, though, that the more interesting questions are: who first defined it? how did the notation change over time? I did some digging, and here’s what I’ve discovered.

The first stop on my investigative tour was Leonard Dickson’s History of the Theory of Numbers (1952). At the beginning of Chapter V, titled “Euler’s Function, Generalizations; Farey Series”, Dickson has two things to say about Leonhard Euler:

“L. Euler… investigated the number φ(n) of positive integers which are relatively prime to n, without then using a functional notation for φ(n).”

“Euler later used πN to denote φ(N)…”

Each of these quotations contains a footnote, the first one to Euler’s paper “Demonstration of a new method in the theory of arithmetic” (written in 1758) and the second to “Speculations about certain outstanding properties of numbers” (written in 1775). In the first paper, Euler is more interested in proving Fermat’s little theorem, which, true to form, he had already proven twice before. However, Euler does define the phi function (on p. 76, though as Dickson says, he doesn’t use function notation), and proves some basic facts about it, including the facts that φ(p^m) = p^(m-1)(p-1) for a prime p [Theorem 3] and φ(AB) = φ(A)φ(B) when A and B are relatively prime [Theorem 5]. This paper is in Latin, and while we do see the use of the words totidem and tot, they don’t seem to hold any special mathematical significance.
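If you’d like to check Theorems 3 and 5 numerically, here is a minimal Python sketch. The function name totient is my own choice, and the brute-force counting is simply the definition Dickson quotes, not anything resembling Euler’s own argument.

```python
from math import gcd

def totient(n: int) -> int:
    """Count the integers k with 1 <= k <= n that are relatively prime to n."""
    return sum(1 for k in range(1, n + 1) if gcd(k, n) == 1)

# Theorem 3: phi(p^m) = p^(m-1) * (p - 1) for a prime p
for p in (2, 3, 5, 7):
    for m in (1, 2, 3):
        assert totient(p**m) == p**(m - 1) * (p - 1)

# Theorem 5: phi(A*B) = phi(A) * phi(B) when A and B are relatively prime
for a, b in ((4, 9), (5, 12), (7, 10)):
    assert gcd(a, b) == 1
    assert totient(a * b) == totient(a) * totient(b)

# The first ten values of the totient
print([totient(n) for n in range(1, 11)])  # [1, 1, 2, 2, 4, 2, 6, 4, 6, 4]
```

Counting by brute force is slow for large n, but it keeps the sketch faithful to the definition and is plenty fast for a table up to 100.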

In the second paper, Euler returns to the phi function, having decided by this time to use π to represent it. Hard-core nerd that he is, Euler provides us with a table of values of πD for D up to 100, and replicates many of the facts he proved in the first paper. It’s interesting to note that, while Euler wrote this second paper in 1775, it wasn’t published until 1784, a year after his death.

It wasn’t until 1801, in Disquisitiones Arithmeticae, that Carl Gauss introduced φN to indicate the value of the totient of N. So why did he pick φ rather than Euler’s π? Well, I checked the English translation by Arthur Clarke (no, not that Arthur Clarke), and I think it’s quite likely that he chose it for no discernible reason. In Clarke’s translation, Gauss introduces φ on page 20, and Gauss loved using Greek letters. In pages 5-19 (the beginning of Section II), he uses α, β, γ, κ, λ, μ, π, δ, ε, ξ, ν, ζ, and only after these does he use φ. As to the use of π, which was Euler’s notation, it’s possible that Gauss knew of Euler’s later work and chose φ because he had already used π, but there’s no way to know for sure. (Also, π was already used for 3.14159… by this point, but if that was his reasoning, it’s odd that he used the symbol π at all.) Most likely, he just picked another Greek letter off the top of his head. It is important to remember that at no point did Gauss use function notation for the totient: it always appears as φN, never φ(N). (Also: Gauss goes on to use Γ and τ before getting tired of Greek and moving on to the fraktur letters 𝔄, 𝔅, and 𝖅.)

The next significant change came nearly a century later in J. J. Sylvester’s article “On Certain Ternary Cubic-Form Equations,” published in the American Journal of Mathematics in 1879. On page 361, Sylvester examines the specific case n = p^i, and says:

p^(i-1)(p-1) is what is commonly designated as the φ function of p^i, the number of numbers less than p^i and prime to it (the so-called φ function of any number I shall here and hereafter designate as its τ function and call its Totient).

While Sylvester’s usage of the word totient has become commonplace, mathematicians continue to use φ instead of τ. It just goes to show that a symbol can become entrenched in the mathematical community, even if a notational change would make more sense. Also of note is the fact that while Sylvester refers to the totient as a function, he doesn’t use the modern parenthesis notation, as in τ(n), but continues in Euler and Gauss’s footsteps by using τn.

And this is where our story ends. Sylvester’s use of the word totient, Gauss’s use of the letter φ, and Euler’s original definition all contributed to the modern construct that we call the phi/totient function. Even though Euler’s original definition came in a Latin paper, it wasn’t until Sylvester that the use of totient became commonplace.

However, Euler had proven many of the basic facts about it as early as 1758. So, while the original phi function was neither phi nor a function, it was undoubtedly Euler’s.


The Evolution of Weights and Measures

At long last, I’ve exhausted my curiosity in mathematical etymologies. Many word histories have been explored in the previous three installments, on counting words, arithmetic operations, and plane curves.

This time around, I want to look at some of the words we use for measurements. There are a few interesting histories in the metric system (SI), but most of the fun comes from the English Imperial system.

The Roman Empire provided us with the primary pre-SI system of measurement in Europe, from which many of the medieval systems were derived. The Latin word mille gives us two important words today: million (related to “thousand”, as detailed in a previous post), and mile. As Roman legions marched across the Mediterranean world, they measured their distances according to paces, with a thousand paces being milia passuum. A pace is the distance traveled in two full steps, and is about 58-62 inches (depending, obviously, on an individual’s height). Using this reckoning, the Roman definition of a mile clocks in at 4,833-5,167 feet.

When the Roman Empire fractured in the West, their uniform measurement system fractured as well, occasionally with hilarious consequences. Later, by the 18th century, the Roman mile had evolved from one definition to many: there were Scots miles, English miles, German miles, and so on. The German mile was 24,000-some feet (at least according to Wikipedia), compared to the English mile’s comparatively paltry 5,280 feet. (Go check that Wikipedia reference, too; there are many more variants!)

But before I get too distracted by the history of the mile, let’s move on to some other length measurements.

  • Inch — this is a fun one. The word comes from the Latin uncia, which basically means “unit”. The strange thing is that an uncia was a unit of weight rather than length—it was 1/12th of a Roman pound. While the English inch is still 1/12th of its parent measure, the ounce somehow became 1/16th of a pound.
  • Furlong — rather simply, it’s a combination of furrow and long, with a furrow being the length of a ten-acre farm field. This makes it about 1/8th of a mile.
  • Yard and Rod — these two have an intertwined history. Today, a yard is 3 feet long, and a rod is 16.5 feet long. The word yard comes from Old English gierd, meaning “rod” or “stick.” Rod comes from the Old Norse rudda, meaning “club”. According to Schwartzman, the rod and the yard were used somewhat interchangeably during the Medieval period, and only later did they settle on 3 and 16.5 feet (or thereabouts)—the “short” and the “long” yard.   
  • Fathom — originating from the Old English fæðm (“faythm”), meaning “arms” or “grasp”. It was the length of a person’s outstretched arms, and is defined as 6 feet today. Perhaps, given its nautical use, a fathom was the distance you could fall off the boat while still being rescued by someone on board?

While there are lots of other words I could choose from, here are two in particular that have a surprising connection.

  • Pound — comes from the Latin pondus, meaning “a weight.” The abbreviation lb. comes from the Latin word libra, meaning “pound” or “balance.” In most markets, merchants would assess the value of precious metals offered for payment using a balance scale (still with us in the popular imagination today). Indeed, one of the signs of the Zodiac is a balance scale. Of course, you’d need to balance the payment against a set of known weights. Over time, then, the word for the weights themselves came to be the English pound, while the word for the scale itself (libra) evolved into its abbreviation.
  • Liter — comes from the Greek litra, which was a unit of weight. Yes, libra and litra have a common origin! Schwartzman notes that lytre and pound were used interchangeably in England as late as the 17th century. When France adopted a decimal system (the precursor to modern SI units), they borrowed the word litron, changing it from a unit of weight to a unit of volume.

There are many, many more words that I didn’t have the time or energy to write up! But hopefully it’s kept your interest throughout the whole series of posts. Get a copy of Schwartzman’s The Words of Mathematics if you want to learn more. 

What weighs more: a pound of gold or a pound of feathers?

Hi everyone! I had intended to write up a full etymology post this month, but time got away from me during the holidays. So for now, I offer an amusing fact taken from Jeff Suzuki’s book, Mathematics in Historical Context.

You may know the old joke “What weighs more: a pound of gold, or a pound of feathers?” The answer, of course, is that a pound is the same regardless of what’s being weighed. However, this was not the case in the Medieval world! While the Romans imposed some uniformity of measurement on most of Europe, by Medieval times individual communities had developed their own variations. This brings us to Suzuki:

The complexity of the system of weights and measures is most obvious in what seems to be a nonsensical question: which weighs more, a pound of gold or a pound of feathers? Gold and other precious commodities were measured in Troy units, named after the semiannual trade fairs at Troyes in Champagne, France, where goods from throughout Europe could be exchanged. The Troy pound is divided into twelve troy ounces, and each ounce into twenty pennyweights, and each pennyweight into 24 grains: thus, a Troy pound is equal to 12 x 20 x 24 = 5760 grains. An avoirdupois pound (from the French “having weight”) was defined as having a weight of 7000 grains: thus a pound of gold (5760 grains) weighed less than a pound of feathers (7000 grains). Even more confusingly, the avoirdupois pound was divided into 16 inappropriately named ounces, so an ounce of gold (20 x 24 = 480 grains) was heavier than an ounce of feathers (7000 ÷ 16 = 437.5).
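Suzuki’s arithmetic is easy to replay. Here is a small Python sketch of it; the constant names are mine, while the conversion factors are the ones given in the quotation.

```python
# Troy system (used for gold), as described in the quotation
GRAINS_PER_PENNYWEIGHT = 24
PENNYWEIGHTS_PER_TROY_OUNCE = 20
TROY_OUNCES_PER_TROY_POUND = 12

# Avoirdupois system (used for feathers), as described in the quotation
GRAINS_PER_AVOIRDUPOIS_POUND = 7000
OUNCES_PER_AVOIRDUPOIS_POUND = 16

troy_ounce = PENNYWEIGHTS_PER_TROY_OUNCE * GRAINS_PER_PENNYWEIGHT          # 480 grains
troy_pound = TROY_OUNCES_PER_TROY_POUND * troy_ounce                       # 5760 grains
avoirdupois_ounce = GRAINS_PER_AVOIRDUPOIS_POUND / OUNCES_PER_AVOIRDUPOIS_POUND  # 437.5 grains

# A pound of gold is lighter than a pound of feathers...
print(troy_pound < GRAINS_PER_AVOIRDUPOIS_POUND)   # True
# ...but an ounce of gold is heavier than an ounce of feathers.
print(troy_ounce > avoirdupois_ounce)              # True
```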

That’s it for now! I will return in the new year with one last post, on the origins of our words for weights and measures.

Calendars, Cycles, and Cool Coincidences (Part II)

This is my second post on the alignment of Thanksgiving and Hanukkah. Go back and read the first post, if you haven’t done so.

When compared to the Julian or Gregorian calendar, the Hebrew calendar is a different animal entirely. First of all, it is not a solar calendar, but is rather a lunisolar calendar. This means that while the years are kept in alignment with the solar year, the months are reckoned according to the motion of the moon. In ancient days, the start of the month was tied to the sighting of the new moon. Eventually, the Jewish people (and more specifically, the rabbis) realized that it would be better for the calendar to rely more on mathematical principles. Credit typically goes to Hillel II, who lived in the 300s CE. In the description that follows, I will be using Dershowitz and Reingold’s Calendrical Calculations as my primary source, with assistance from Tracy Rich’s Jew FAQ page.

The typical Jewish year contains 12 months of 29 or 30 days each, and is often 354 days long. (See how I worded that? It matters.) Clearly, this is significantly shorter than the solar year, so some adjustments are necessary. Specifically, there is a leap year for 7 of every 19 years. But instead of adding a leap day, the Hebrew calendar goes right ahead and adds an entire month (Adar II), which adds 30 days to the length of the year. Mathematically, you can figure out if year y is a leap year by calculating (7y+1) mod 19; if the answer is < 7, then y is a leap year. In the current year, 5774, the calculation is 7*5774+1 = 40419 ≡ 6 (mod 19), so it’s a leap year. With just this fact, the average length of the year appears to be 365.053 days, about 4 1/2 hours fast. At a minimum, the leap months explain how Jewish holidays move through the Gregorian calendar: since the typical year is 354 days, a holiday will move earlier and earlier each year, until a leap month occurs, at which point it will snap back to a later date. (Next year, Hanukkah will be on 17 December.)
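The leap-year test and the 365.053 figure can both be reproduced in a few lines of Python. This is only a sketch of the simplified model in the paragraph above (354-day common years, 384-day leap years); the function name is my own.

```python
def is_hebrew_leap_year(year: int) -> bool:
    """Leap-year test from the text: (7y + 1) mod 19 < 7."""
    return (7 * year + 1) % 19 < 7

print((7 * 5774 + 1) % 19)          # 6
print(is_hebrew_leap_year(5774))    # True

# Any 19 consecutive years contain exactly 7 leap years.
leap_count = sum(is_hebrew_leap_year(y) for y in range(5758, 5758 + 19))
print(leap_count)                   # 7

# Simplified average year: 12 common years of 354 days, 7 leap years of 384 days.
average_year = (12 * 354 + 7 * 384) / 19
print(round(average_year, 3))       # 365.053
```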

But it’s not as simple as all that. Owing to the lunar origins of the Hebrew calendar, the beginning of the new year is determined by the occurrence of the new moon (called the molad) in the month of Tishrei (the Jewish New Year, Rosh Hashanah, is on 1 Tishrei). Owing to the calendar reforms of Hillel II, this has become a purely mathematical process. Basically, you take a previously calculated molad and use the average length of the moon’s cycle to calculate the molad for any future month. Adding a wrinkle to this calculation is the fact that the ancient Jews used a timekeeping system in which the day had 24 hours and each hour was divided into 1080 “parts”. (So, one part = 3 1/3 seconds.) In this system, the average length of a lunar cycle is estimated as 29d 12h 793p. While this estimate is many centuries old, it is incredibly accurate—the average synodic period of the moon is 29d 12h 792.86688p, a difference of less than half a second.
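To get a feel for the “parts” system, here is a short Python sketch that converts the traditional lunar month into parts and days, and measures the gap against the synodic figure quoted above. The helper name to_parts is my own invention.

```python
from fractions import Fraction

PARTS_PER_HOUR = 1080
PARTS_PER_DAY = 24 * PARTS_PER_HOUR            # 25,920 parts in a day
SECONDS_PER_PART = Fraction(3600, PARTS_PER_HOUR)  # 10/3, i.e. 3 1/3 seconds

def to_parts(days: int, hours: int, parts: int) -> int:
    """Convert a days/hours/parts duration into a whole number of parts."""
    return days * PARTS_PER_DAY + hours * PARTS_PER_HOUR + parts

molad_interval = to_parts(29, 12, 793)
print(molad_interval)                                  # 765433
print(float(Fraction(molad_interval, PARTS_PER_DAY)))  # roughly 29.530594 days

# Gap between the traditional estimate (793p) and the figure in the text (792.86688p)
error_seconds = (793 - 792.86688) * float(SECONDS_PER_PART)
print(error_seconds < 0.5)                             # True
```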

Once the molad of Tishrei has been calculated, there are 4 postponement rules, called the dechiyot, which add another layer to the calculation:

  1. If the molad occurs late in the day (12pm or 6pm depending on your source) Rosh Hashanah is postponed by a day.
  2. Rosh Hashanah cannot occur on a Sunday, Wednesday, or Friday. If so, it gets postponed by a day.
  3. The year is only allowed to be 353-355 days long (or 383-385 days in a leap year). The calculations for year y can have the effect of making year y+1 too long, in which case Rosh Hashanah in year y will get postponed to avoid this problem.
  4. If year y-1 is a leap year, and Rosh Hashanah for year y is on a Monday, the year y-1 may be too short. Rosh Hashanah for year y needs to get postponed a day.

As someone who’s relatively new to the Hebrew calendar, all of this was very confusing to me. For one thing, it’s not clear that rules 3 and 4 will really keep the length of the year in the correct range. For another, it’s not clear what you’d do with the “extra” days that are inserted or removed. Here’s how I think of it: the years in the Hebrew calendar don’t live in arithmetical isolation, but are designed to be elastic. You can stretch or shrink adjacent years by a day or two so that the start of each year begins on an allowable day. When a year needs to be stretched, a leap day is included at the end of the month of Cheshvan. When a year needs to be shrunk, an “un-leap” day is removed from the end of Kislev.

Now here’s the question my mathematician’s soul wants to answer: How long is the period for the Hebrew calendar? This might seem an impossible question in light of all the postponement rules, but it turns out that each block of 19 years will have exactly the same length: 6939d 16h 595p, or 991 weeks with a remainder of 69,715 parts. As with the Julian calendar, the days of the week don’t match from block to block, so we need to use the length of a week (181,440 parts) and find the least common multiple. Using parts as the basic unit of measurement, we have:

lcm(69715, 181440) = 2,529,817,920 parts, which takes 2,529,817,920 / 69,715 = 36,288 blocks of 19 years = 689,472 years.
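Here is one way to replay this calculation in Python. It is a sketch under one assumption of mine that the text doesn’t state outright: a 19-year block holds 235 months (12 common years of 12 months plus 7 leap years of 13 months), each with the average length 29d 12h 793p.

```python
from math import gcd

PARTS_PER_DAY = 24 * 1080                            # 25,920 parts in a day
PARTS_PER_WEEK = 7 * PARTS_PER_DAY                   # 181,440 parts in a week
MONTH_PARTS = 29 * PARTS_PER_DAY + 12 * 1080 + 793   # average lunar month, in parts
MONTHS_PER_BLOCK = 12 * 12 + 7 * 13                  # assumed: 235 months per 19 years

# Length of one 19-year block, expressed as days, hours, parts
block_parts = MONTHS_PER_BLOCK * MONTH_PARTS
days, rest = divmod(block_parts, PARTS_PER_DAY)
hours, parts = divmod(rest, 1080)
print(days, hours, parts)                            # 6939 16 595

# What is left over after removing whole weeks from one block
remainder = block_parts % PARTS_PER_WEEK
print(remainder)                                     # 69715

# The leftovers add up to whole weeks after lcm / remainder blocks
lcm_parts = remainder * PARTS_PER_WEEK // gcd(remainder, PARTS_PER_WEEK)
blocks = lcm_parts // remainder
period_years = 19 * blocks
print(lcm_parts, blocks, period_years)               # 2529817920 36288 689472
```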

Wow! We can also calculate the “combined period” of the Hebrew and Gregorian calendars, to see how frequently they will align exactly. Writing the average year lengths as fractions, the calculation is:

lcm(689472*(365+24311/98496), 400*(365+97/400)) = 5,255,890,855,047 days = 14,390,140,400 Gregorian years = 14,389,970,112 Hebrew years.
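The combined period can be checked with exact fractions. The Python sketch below uses the two average year lengths exactly as written in the formula; the variable names are my own.

```python
from fractions import Fraction
from math import gcd

HEBREW_MEAN_YEAR = 365 + Fraction(24311, 98496)   # days, from the formula above
GREGORIAN_MEAN_YEAR = 365 + Fraction(97, 400)     # days, i.e. 365.2425

hebrew_period = 689472 * HEBREW_MEAN_YEAR         # Hebrew period in days
gregorian_period = 400 * GREGORIAN_MEAN_YEAR      # Gregorian period in days

# Both periods come out as whole numbers of days, so an integer lcm makes sense.
assert hebrew_period.denominator == 1 and gregorian_period.denominator == 1
h, g = int(hebrew_period), int(gregorian_period)
print(h, g)                                       # 251827457 146097

combined_days = h * g // gcd(h, g)
print(combined_days)                              # 5255890855047
print(combined_days / GREGORIAN_MEAN_YEAR)        # 14390140400
print(combined_days / HEBREW_MEAN_YEAR)           # 14389970112
```

Using Fraction rather than floating point matters here: the numbers run to thirteen digits, and rounding error would make the lcm meaningless.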

For comparison, the age of the universe is about 13,730,000,000 years. So while particular dates can align more frequently (for instance, Thanksgivukkah last occurred in 1888), the calendars as a whole won’t ever realign again. However, I suppose that claim depends on your view of the expansion of the universe!

Calendars, Cycles, and Cool Coincidences (Part I)

You might have heard that Hanukkah and Thanksgiving coincide this year. More specifically, you may have heard that the first day of Hanukkah (25 Kislev in the Hebrew calendar) coincides with 28 November, which just happens to be the fourth Thursday of the month. Somewhere along the way, a few clever marketers dubbed this day “Thanksgivukkah”, and America has responded: the LA Times has a recipe for “turbrisket”, kids in South Florida have been designing “menurkeys”, and Zazzle.com has a line of Thanksgivukkah greeting cards. Christine Byrne has assembled an entire Thanksgivukkah menu. I, for one, am enjoying all of the portmanteaus. (Speaking of which, have you heard of Franksgiving?)

But in addition to being a fan of portmanteaus, I’m also a fan of calendars. Some weeks ago, I began to hear from various sources that Hanukkah won’t line up with Thanksgiving for another 70,000 years or so. This got me curious, so I started researching the question myself. It turns out that the relationship between the two holidays has been examined on at least three blogs over the past few years. The first were the Lansey brothers in 2010, followed by Stephen Morse in 2012 and Jonathan Mizrahi in January of this year. Morse’s post includes a “When Did?” page with a JavaScript calendar program, and Eli Lansey kindly includes a Mathematica notebook to help the math-inclined to do the computations themselves.

Morse reports that Thanksgiving will again occur on the first day of Hanukkah in the year 79,043, while Mizrahi says it’ll be in the year 79,811. Mizrahi, by his own admission, is being cute with this number:

In all honesty, though, all of these dates are unfathomably far in the future, which was really the point [of the post].

In this post, I won’t go into exactly how the ≈79,000 number was unearthed. I will, however, sketch out some of the major features of both the Gregorian and Hebrew calendars, and how they have given rise to this strange, new holiday of Thanksgivukkah. In the end, we will find that the year 79,811 is not nearly as unfathomable as we can get.

First of all, the Western calendar as we’ve come to know it began its life as the Egyptian calendar. After the Canopus Decree in c. 238 BCE, each year in the Egyptian calendar was 365 days long, with an additional day added every 4 years. There were twelve 30-day months and five (or six) epagomenal days—days with no year or month assigned to them—to celebrate the coming of the new year. I think Pharaoh Ptolemy III said it best:

This festival is to be celebrated for 5 days: placing wreaths of flowers on their head, and placing things on the altar, and executing the sacrifices and all ceremonies ordered to be done. But that these feast days shall be celebrated in definite seasons for them to keep for ever … one day as feast of Benevolent Gods be from this day after every 4 years added to the 5 epagomenae before the new year, whereby all men shall learn, that what was a little defective in the order as regards the seasons and the year, as also the opinions which are contained in the rules of the learned on the heavenly orbits, are now corrected and improved by the Benevolent Gods.

The Egyptian model came to Rome with Julius Caesar’s calendar reforms in 46 BCE, which fixed the seriously messed up Roman calendar. It all went pretty well for the first several centuries, but there was a tiny fly in the ointment. The average length of a year in the Julian calendar is 365.25 days, while the solar year is approximately 365.24219 days long. So the Julian calendar ran slow (about 11.25 minutes per year) for 1600 years until this problem was fixed by the Gregorian calendar reforms of 1582. More specifically, Pope Gregory XIII issued a papal bull, Inter gravissimas, in which he declared that years divisible by 4 would continue to be leap years, except that years divisible by 100 but not by 400 would no longer be leap years. So, 1900 was not a leap year, but 2000 was. This provides an average length of 365.2425 days, which is only 0.00031 days (about 27 seconds) longer than the solar year. In the 431 years that have passed since the birth of the Gregorian calendar, this error has only accumulated to 3.2 hours. While the calendar isn’t perfect, it’s really quite good, especially considering that the solution amounted merely to omitting 3 leap days every 400 years.
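The Gregorian rule and the error figures above fit in a few lines of Python. This is a sketch only; the function name is mine, and the solar year length is the 365.24219 approximation used in the paragraph.

```python
from fractions import Fraction

def is_gregorian_leap_year(year: int) -> bool:
    """Divisible by 4, except century years that are not divisible by 400."""
    return year % 4 == 0 and (year % 100 != 0 or year % 400 == 0)

print(is_gregorian_leap_year(1900), is_gregorian_leap_year(2000))  # False True

# Leap days in one 400-year cycle, and the resulting average year length
leap_days = sum(is_gregorian_leap_year(y) for y in range(1, 401))
mean_year = 365 + Fraction(leap_days, 400)
print(leap_days, float(mean_year))                                 # 97 365.2425

SOLAR_YEAR = 365.24219  # days, the approximation used in the text
error_seconds = (float(mean_year) - SOLAR_YEAR) * 86400            # about 27 seconds a year
error_hours = 431 * error_seconds / 3600                           # about 3.2 hours since 1582
print(round(error_seconds), round(error_hours, 1))                 # 27 3.2
```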

The key concept I’m interested in here is periodicity. In mathematics, a function is said to be periodic if it exactly repeats its values in regular intervals (or, periods). The sine function is an example of this: sin(x) = sin(x+2π) = sin(x+4π) = …, for any real value of x. It’s very important to distinguish between a periodic function and a function that just happens to repeat some of its values. For example, the function f(x) = 1 - x^2 repeats a value, since f(-1) and f(1) are both equal to zero, but that doesn’t mean the function repeats itself exactly on an interval. Loosely speaking, a function is periodic when the entire curve repeats itself, not just a few select points.

We can transfer this idea to a given calendar without too much trouble:

  • A calendar’s cycle is the amount of time it takes for the calendar to repeat itself exactly.
  • A calendar’s period is the amount of time it takes for the calendar to repeat itself exactly, while also taking the days of the week into account.

For consistency, it’s best to measure both the cycle and period in days, but sometimes I’ll divide by the average length of a year. For example, the Julian calendar has a cycle of 1461 days, and dividing by 365.25 gives a result of 4 years. To get the period, we need to remember that since there are 52 weeks plus 1 or 2 days in any given year, the days of the week won’t line up every 4 years. So we have to take the least common multiple to get the period: lcm(1461, 7) = 10,227 days = 28 years. For the Gregorian calendar, the cycle is 146,097 days (400 years) and the period is lcm(146097, 7) = 146097 days = 400 years—this is because 146097 happens to be a multiple of 7.
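For the curious, here is a Python sketch of the cycle and period calculations for both calendars. The helper lcm is defined by hand so the sketch doesn’t depend on a particular Python version.

```python
from math import gcd

def lcm(a: int, b: int) -> int:
    """Least common multiple of two positive integers."""
    return a * b // gcd(a, b)

DAYS_PER_WEEK = 7

# Julian calendar: one leap day every 4 years
julian_cycle = 4 * 365 + 1                         # 1461 days
julian_period = lcm(julian_cycle, DAYS_PER_WEEK)   # 10227 days
print(julian_cycle, julian_period, julian_period / 365.25)              # 1461 10227 28.0

# Gregorian calendar: 97 leap days every 400 years
gregorian_cycle = 400 * 365 + 97                   # 146097 days
gregorian_period = lcm(gregorian_cycle, DAYS_PER_WEEK)
print(gregorian_cycle, gregorian_period, gregorian_period / 365.2425)   # 146097 146097 400.0
print(gregorian_cycle % DAYS_PER_WEEK)             # 0, so the cycle is a whole number of weeks
```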

400 years is a long time, and this post has gotten pretty long, too. So I’ve broken it into two parts. Come back soon for Part II, where we will examine the mathematical labyrinth that is the Hebrew calendar…

The Evolution of Plane Curves

This is the third post in the “Evolution Of…” series; the first and second posts can be found here and here.

This time around, we’ll explore some of the words that have come to be used for various plane curves. First of all, a disclaimer: often, the names of the curves existed many centuries before the development of modern algebra and the Cartesian coordinate system. As a consequence, the original names for the curves are more geometric in origin (imagine one of the ancient Greeks saying “umm, well, it looks like a flower… so let’s call it the flower curve“).

While reviewing the curves listed in Schwartzman’s book, I noticed that most of them can be classified into four major groups: the conics, the chrones, the trixes, and the oids.

  1. Conics. You’ve probably heard of them: circle, ellipse, parabola, and hyperbola. The first one has its origins in the Latin word circus, which means “ring” or “hoop.” The other three are Greek, with their original meanings reflecting the Greeks’ use of conic sections. Ellipse comes from en (meaning “in”) and leipein (meaning “to leave out”). For the other two, note that –bola comes from ballein which means “to throw” or “to cast”. So hyperbola means “to cast over” and parabola means “to cast alongside”. (If you check out this image from Wikipedia, it may start to make more sense.)
  2. Chrones. The two curves I have in mind here are brachistochrone and tautochrone. In Greek, chrone means “time”. The prefixes come from brakhus and tauto-, which mean “short” and “same”, respectively. So these curves’ names are really “short time” and “same time.” Naturally enough, the brachistochrone is the curve on which a ball will take the least amount of time to roll down, while the tautochrone is the curve on which the time to roll down is the same regardless of the ball’s starting point. Finding equations for these curves occupied the time of many scientists and mathematicians in the 17th century, including a controversy between the brothers Jakob and Johann Bernoulli. You could say they had a “chronic” case of sibling rivalry. 
  3. Trixes. I am a fan of the feminine suffix –trix because it also provides us with the modern word obstetrics (literally, “the woman who gets in the way”—i.e., a midwife). We don’t use this suffix very much anymore, though aviatrix comes to mind. The algebraic curves in this category are trisectrix (“cut into three”) and tractrix (“the one that pulls”), along with the parabola-related term directrix (“the one that directs”). Interestingly, the masculine form of tractrix gives us the English word tractor.
  4. Oids. These were the most fun for me. The suffix is Greek, originating in oeides, which means “form” (though in modern English, “like” might be more appropriate). Here are some examples: astroid, cardioid, cissoid, cochleoid, cycloid, ramphoid, strophoid. Here are their original Greek/Latin meanings: “star-like”, “heart-like”, “ivy-like”, “snail-like”, “circle-like”, “like a bird’s beak”, “(having the) form of turning”. I’ve provided images of each one below; see if you can match the name to the curve!
[Images: seven unlabeled curves, Curve1 through Curve7]

Don’t worry, there are plenty more word origins coming later! However, I’ll need a break to recharge my etymology batteries. Expect an “intermission” post in the next few weeks.

The Evolution of Arithmetic

This post is the second in a series; if you haven’t read the first post, on the evolution of English counting words, I’d recommend reading that one first.

As promised, this post looks at the origins of the English words for arithmetic operations. Read on, friend!

  • Plus and Minus. These two are fairly straightforward: they’re the Latin words for “more” and “less”, respectively. The symbols, though, are less clear. It appears that the letters p and m were used (sometimes appearing as p̄ and m̄) during the 1400s; Wikipedia claims that these first appeared in Luca Pacioli’s Summa de Arithmetica, though I’ve been unable to find a satisfactory example. In the 1500s, the modern + and – signs began to appear; Schwartzman attributes the + to an abbreviation of the Latin “et” (taking the t only) and the – to the bar from m̄.
  • Multiply. This word comes from the Latin multiplicare, meaning “to increase.” Breaking it down a little further, we have the prefix multi– (“many”) and the suffix -plex (“fold”) so that the compound word multiplex means “many folds.” (We still use “fold” language today: when we speak of a “threefold increase,” we mean that something has been multiplied by three.) The × symbol for multiplication is attributed to William Oughtred, while Schwartzman gives credit for the dot • to Gottfried Wilhelm Leibniz.
  • Divide. This word comes from Latin as well, with the origin being dividere, meaning “to separate.” (As a side note, the root videre means “to see” and gives us the modern word video, which means “I see”.) Putting di– and videre together, I suppose this means that division is literally “to see in two.”

Notice that all four of these words originate in a description of the operation itself. It turns out that exponents and roots are a little more metaphorical in their meaning:

  • Exponent. Once again, we have a Latin origin: the prefix ex– and the verb ponere, roughly meaning “to put out.” Unlike the four arithmetic operations, though, the original meaning is typographical—the exponent is the number that is “put out” above and to the right of the base. In part, it’s because the exponent is a relatively new development; Schwartzman attributes the notation to Descartes, specifically La Géométrie (1637).
  • Root. Finally, a non-Latin word! The word rot means “cause” or “origin”, which makes sense when you consider that since 8 = 2^3, its “origin” is 2. If you trace the word further back, the Proto-Indo-European root (see what I did there?) is wrad-. Thus, the Latin-based words radical and radish come from a source similar to root.

And there you have it! In the next installment, I’ll get a little more geometric and explore some words we’ve come to use for algebraic curves.