Menengah Exercism • go

Isogram

Ringkasan Tantangan

Determine if a word or phrase is an isogram.

Instructions

Determine if a word or phrase is an isogram.

An isogram (also known as a “non-pattern word”) is a word or phrase without a repeating letter, however spaces and hyphens are allowed to appear multiple times.

Examples of isograms:

lumberjacks
background
downstream
six-year-old

The word isograms, however, is not an isogram, because the s repeats.

Dig Deeper

map

`map`

// Package isogram is a small library for analyzing if a phrase is an isogram.
package isogram

import (
	"unicode"
)

type blank struct{}

// IsIsogram accepts a phrase and returns if it is an isogram.
func IsIsogram(phrase string) bool {
	lookup := make(map[rune]blank)
	for _, chr := range phrase {
		if !unicode.IsLetter(chr) {
			continue
		}
		ltr := unicode.ToLower(chr)
		if _, exists := lookup[ltr]; exists {
			return false
		}
		lookup[ltr] = blank{}
	}
	return true
}

This approach starts by defining the type blank as an empty struct. The lookup map is defined with a rune as the key and blank (i.e. an empty struct) as the value.

By assigning values of an empty struct to the `map`, it essentially makes the keys what is known as a "set" in other languages.
A set is a collection of unique values.
More info on implementing a set in Go can be found [here](https://yourbasic.org/golang/implement-set/).

The ways to iterate characters are by Unicode runes, or by each letter being a string, or by each letter being a byte. The runes are from range on a string, the strings from Split(), and the bytes from indexing into the string. Another way to iterate runes is to convert the string to a rune slice and range on it. The difference between ranging on a rune slice vs ranging on a string is that the index returned from a string is the position of the next rune in bytes, not which rune it is. For example, if the first unicode character is two bytes, then the second unicode character index will be 2 when ranging on a string and 1 when ranging on a rune slice. In this approach we don’t care about the index, so ranging on a string is fine.

unicode.IsLetter() is used to ignore the character if it is not a Unicode letter.
unicode.ToLower() is used to ensure the letter is lowercase. The occurrence of the letter a and the letter A count as a repeated letter, so Alpha would not be an isogram. Lowercasing A to a makes it easier to check for the repeated a.
An if with a short statement is used to both assign the exists variable and to test if it’s true. If it is true, then the letter has been used before, and the function returns false. If it is not true, the control falls through to the next statement, which creates the letter key and assigns it the empty struct value.

If the loop finishes without returning false, then no letters are repeated, and the function returns true.

strings.Count()

`strings.Count()`

// Package isogram is a small library for analyzing if a phrase is an isogram.
package isogram

import (
	"strings"
	"unicode"
)

// IsIsogram accepts a phrase and returns if it is an isogram.
func IsIsogram(phrase string) bool {
	phrase = strings.ToLower(phrase)
	for _, chr := range phrase {
		if !unicode.IsLetter(chr) {
			continue
		}
		if strings.Count(phrase, string(chr)) > 1 {
			return false
		}
	}
	return true
}

The steps for this solution are

strings.ToLower() produces a new string with the characters of the passed-in string changed to lower case. The occurrence of the letter a and the letter A count as a repeated letter, so Alpha would not be an isogram. Lowercasing Alpha to alpha makes it easier to check for the repeated a.

unicode.IsLetter() is used to ignore the character if it is not a Unicode letter.
The letter rune is converted to a string to be passed to strings.Count(). strings.Count() is used to get the number of occurrences of the letter in the entire string.
If the letter occurs more than once, then the function returns false.

If the loop finishes without returning false, then no letters are repeated, and the function returns true.

Bit field

// Package isogram is a small library for analyzing if a phrase is an isogram.
package isogram

// IsIsogram accepts a phrase and returns if it is an isogram.
func IsIsogram(phrase string) bool {
	theEnd := len(phrase)
	var letterFlags uint32 = 0

	for i := range theEnd {
		letter := phrase[i]
		if letter > 96 && letter < 123 {
			if (letterFlags & (1 << (letter - 'a'))) != 0 {
				return false
			} else {
				letterFlags |= (1 << (letter - 'a'))
			}
		} else if letter > 64 && letter < 91 {
			if (letterFlags & (1 << (letter - 'A'))) != 0 {
				return false
			} else {
				letterFlags |= (1 << (letter - 'A'))
			}
		}
	}
	return true
}

This solution uses the ASCII value of the letter to set the corresponding bit position.

The ways to iterate characters are by Unicode runes, or by each letter being a string, or by each letter being a byte. The runes are from range on a string, the strings from Split(), and the bytes from indexing into the string. Another way to iterate runes is to convert the string to a rune slice and range on it. The difference between ranging on a rune slice vs ranging on a string is that the index returned from a string is the position of the next rune in bytes, not which rune it is. For example, if the first unicode character is two bytes, then the second unicode character index will be 2 when ranging on a string and 1 when ranging on a rune slice. As of the time of this writing we can iterate bytes, since all of the characters are ASCII.

The string is looped through its characters, looking for a character being a through z or A through Z.

The ASCII value for a is 97, and for z is 122.
The ASCII value for A is 65, and for Z is 90.
If the lower-cased letter is subtracted by a, then a will result in 0, because 97 minus 97 equals 0. z would result in 25, because 122 minus 97 equals 25. So a would have 1 shifted left 0 places (so not shifted at all) and z would have 1 shifted left 25 places.
If the upper-cased letter is subtracted by A, then A will result in 0, because 65 minus 65 equals 0. Z would result in 25, because 90 minus 65 equals 25. So A would have 1 shifted left 0 places (so not shifted at all) and Z would have 1 shifted left 25 places.

In that way, both a lower-cased z and an upper-cased Z can share the same position in the bit field.

So, for an unsigned thirty-two bit integer, if the values for a and Z were both set, the bits would look like

      zyxwvutsrqponmlkjihgfedcba
00000010000000000000000000000001

We can use the bitwise AND operator to check if a bit has already been set. If it has been set, we know the letter is duplicated and we can immediately return false. If it has not been set, we can use the bitwise OR operator to set the bit. If the loop completes without finding a duplicate letter (and returning false), the function returns true.

This approach is fast, but it only works with the ASCII alphabet.

Community Solutions

Lihat Solusi Komunitas

The following are the top 3 community solutions by stars:

Solution 1 by @onlyann (⭐ 181) — View on Exercism
Solution 2 by @Jarak-Jakar (⭐ 7) — View on Exercism
Solution 3 by @AMiiR (⭐ 0) — View on Exercism

Source: Exercism go/isogram

💡 Lihat Solusi

"strings"
	"unicode"
)

func IsIsogram(s string) bool {
	s = strings.ToLower(s)

	for i, c := range s {
		if unicode.IsLetter(c) && strings.ContainsRune(s[i+1:], c) {
			return false