Dig Deeper
String, List, and Dictionary Methods
```python
def answer(question):
    if not question.startswith("What is") or "cubed" in question:
        raise ValueError("unknown operation")

    question = question.removeprefix("What is")
    question = question.removesuffix("?")
    question = question.replace("by", "")
    question = question.strip()

    if not question:
        raise ValueError("syntax error")

    formula = question.split()
    while len(formula) > 1:
        try:
            x_value = int(formula[0])
            y_value = int(formula[2])
            symbol = formula[1]
            remainder = formula[3:]

            if symbol == "plus":
                formula = [x_value + y_value] + remainder
            elif symbol == "minus":
                formula = [x_value - y_value] + remainder
            elif symbol == "multiplied":
                formula = [x_value * y_value] + remainder
            elif symbol == "divided":
                formula = [x_value / y_value] + remainder
            else:
                raise ValueError("syntax error")
        except:
            raise ValueError("syntax error")

    return int(formula[0])
```
Within the answer() function, the question is first checked for “unknown operations” by validating that it starts with “What is” (str.startswith) and does not include the word “cubed” (which is an invalid operation).
This eliminates all the current cases where a ValueError("unknown operation") needs to be raised.
Should the definition of a question expand or change, this strategy would need to be revised.
The question is then “cleaned” by removing the prefix “What is” and the suffix “?” (str.removeprefix, str.removesuffix), replacing “by” with "" (str.replace), and stripping any leading or trailing whitespace (str.strip).
If the question is now an empty string, a ValueError("syntax error") is raised.
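As a rough illustration of the cleaning step on its own, the sketch below wraps the four string calls in a function. The name `clean` is a hypothetical helper used only for this example and is not part of the solution; `str.removeprefix` and `str.removesuffix` require Python 3.9 or later.

```python
def clean(question):
    """Hypothetical helper: isolates the cleaning step described above."""
    question = question.removeprefix("What is")
    question = question.removesuffix("?")
    question = question.replace("by", "")
    return question.strip()


print(repr(clean("What is 6 multiplied by 4?")))  # '6 multiplied  4'
print(repr(clean("What is?")))                    # ''
```

Removing "by" leaves a double space behind, which is harmless here because `str.split` with no arguments splits on any run of whitespace.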
The remaining question string is then converted into a list of elements via str.split, and that list is iterated over using a while-loop with a len() > 1 condition.
Within a try-except block to trap/handle any errors (which will all map to ValueError("syntax error")), the question list is divided up among 4 variables using bracket notation:
- The first element, x_value. This is assumed to be a number, so it is converted to an int().
- The third element, y_value. This is also assumed to be a number and converted to an int().
- The second element, symbol. This is assumed to be an operator, and is left as-is.
- The remainder of the question, if there is any. This is a slice starting at index 3, and going to the end.
symbol is then tested against “plus”, “minus”, “multiplied”, and “divided”. On a match, the operation is applied to x_value and y_value, and a new formula list is created by concatenating a one-element list holding the result with the remainder list.
If symbol doesn’t match any known operators, a ValueError("syntax error") is raised.
Once len(formula) == 1, the first element (formula[0]) is converted to an int() and returned as the answer.
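To make the reduction easier to follow, here is a minimal sketch that records the formula list after each pass of the loop. `reduce_steps` is a hypothetical helper written for this illustration; it handles only "plus" and "minus" and leaves out the try-except wrapper for brevity.

```python
def reduce_steps(formula):
    """Hypothetical helper: records the formula list after each pass of the loop."""
    steps = [formula]
    while len(formula) > 1:
        x_value = int(formula[0])
        y_value = int(formula[2])
        symbol = formula[1]
        remainder = formula[3:]

        if symbol == "plus":
            formula = [x_value + y_value] + remainder
        elif symbol == "minus":
            formula = [x_value - y_value] + remainder
        else:
            raise ValueError("syntax error")
        steps.append(formula)
    return steps


for step in reduce_steps("1 plus 5 minus 2".split()):
    print(step)
# ['1', 'plus', '5', 'minus', '2']
# [6, 'minus', '2']
# [4]
```

Each pass consumes three elements and puts one result back, so the list shrinks by two until a single value remains.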
Variation 1: Use a Dictionary for Lookup/Replace
```python
OPERATIONS = {"plus": '+', "minus": '-', "multiplied": '*', "divided": '/'}


def answer(question):
    if not question.startswith("What is") or "cubed" in question:
        raise ValueError("unknown operation")

    question = question.removeprefix("What is").removesuffix("?").replace("by", "").strip()

    if not question:
        raise ValueError("syntax error")

    formula = []
    for operation in question.split():
        formula.append(OPERATIONS.get(operation, operation))

    while len(formula) > 1:
        try:
            x_value = int(formula[0])
            y_value = int(formula[2])
            symbol = formula[1]
            remainder = formula[3:]

            if symbol == "+":
                formula = [x_value + y_value] + remainder
            elif symbol == "-":
                formula = [x_value - y_value] + remainder
            elif symbol == "*":
                formula = [x_value * y_value] + remainder
            elif symbol == "/":
                formula = [x_value / y_value] + remainder
            else:
                raise ValueError("syntax error")
        except:
            raise ValueError("syntax error")

    return int(formula[0])
```
[Method chaining][method-chaining] is used in the clean step for this variation, and is the equivalent of assigning and re-assigning `question` as is done in the initial approach.
This works because `str.removeprefix`, `str.removesuffix`, `str.replace`, and `str.strip` all return strings, so the output of one can be used as the input to the next.
[method-chaining]: https://www.tutorialspoint.com/Explain-Python-class-method-chaining
This variation creates a dictionary to map operation words to symbols.
It pre-processes the question string into a formula list by looking up the operation words and replacing them with the symbols via the <dict>.get method, which takes a default argument that is returned when the key is not found (instead of a KeyError being raised).
Here the default for dict.get() is set to the element being iterated over, which is effectively “if not found, keep it as-is”.
This means the number strings are passed through unchanged, even though looking them up with bracket notation would raise a KeyError.
The results of iterating through the question are appended to formula via list.append.
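A short sketch of the lookup behavior, using the same `OPERATIONS` dictionary as the variation above. The sample question text is made up for illustration.

```python
OPERATIONS = {"plus": '+', "minus": '-', "multiplied": '*', "divided": '/'}

print(OPERATIONS.get("plus", "plus"))  # +  (key found, the mapped symbol is returned)
print(OPERATIONS.get("7", "7"))        # 7  (key missing, the default is returned)

formula = []
for operation in "7 plus 3 multiplied 2".split():
    formula.append(OPERATIONS.get(operation, operation))

print(formula)  # ['7', '+', '3', '*', '2']
```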
This dictionary is not necessary, but does potentially make adding/tracking future operations easier, although the if-elif-else block in the while-loop remains awkward to maintain (see the Import Callables from the Operator Module approach for a way to replace the block).
The while-loop, if-elif-else block, and the try-except block are then the same as in the initial approach.
There are a couple of common alternatives to the `loop-append` used here:
1. [`list-comprehensions`][list-comprehension] duplicate the same process in a more succinct and declarative fashion. This one also includes filtering out "by":
```python
formula = [OPERATIONS.get(operation, operation) for
           operation in question.split() if operation != 'by']
```
2. The built-in [`filter()`][filter] and [`map()`][map] functions used with a [`lambda`][lambdas] to process the elements of the list.
This is identical in process to both the `loop-append` and the `list-comprehension`, but might be easier to reason about for those coming from a more functional programming language:
```python
formula = list(map(lambda x: OPERATIONS.get(x, x),
                   filter(lambda x: x != "by", question.split())))
```
[list-comprehension]: https://docs.python.org/3/tutorial/datastructures.html#list-comprehensions
[lambdas]: https://docs.python.org/3/howto/functional.html#small-functions-and-the-lambda-expression
[filter]: https://docs.python.org/3/library/functions.html#filter
[map]: https://docs.python.org/3/library/functions.html#map
Rather than indexing and slicing, unpacking and multiple assignment can be used to assign the variables.
However, this does require moving the int() conversion to the point where the new formula list is built:
```python
x_value, symbol, y_value, *remainder = formula  # <-- Unpacking won't allow conversion to int() here.

...
if symbol == "+":
    formula = [int(x_value) + int(y_value)] + remainder  # <-- Instead, conversion to int() must happen here.
...

return int(formula[0])
```
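The following sketch shows how the starred target behaves for lists of different lengths. `split_formula` is a hypothetical helper that exists only to make the unpacking easy to try out.

```python
def split_formula(formula):
    """Hypothetical helper: unpack a formula list into its four parts."""
    x_value, symbol, y_value, *remainder = formula
    return x_value, symbol, y_value, remainder


print(split_formula(["1", "+", "5", "-", "2"]))  # ('1', '+', '5', ['-', '2'])
print(split_formula(["1", "+", "5"]))            # ('1', '+', '5', [])

try:
    split_formula(["1", "+"])                    # too few elements to unpack
except ValueError as error:
    print(error)
```

When fewer than three elements are available, the unpacking itself raises a ValueError, which the surrounding try-except in answer() maps to ValueError("syntax error").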
Variation 2: Structural Pattern Matching to Replace if-elif-else
Introduced in Python 3.10, structural pattern matching can be used to replace the if-elif-else chain in the while-loop used in the two approaches above.
In some circumstances, this could be easier to read and/or reason about:
```python
def answer(question):
    if not question.startswith("What is") or "cubed" in question:
        raise ValueError("unknown operation")

    question = question.removeprefix("What is").removesuffix("?").replace("by", "").strip()

    if not question:
        raise ValueError("syntax error")

    formula = question.split()
    while len(formula) > 1:
        try:
            x_value, symbol, y_value, *remainder = formula  # <-- unpacking and multiple assignment.

            match symbol:
                case "plus":
                    formula = [int(x_value) + int(y_value)] + remainder
                case "minus":
                    formula = [int(x_value) - int(y_value)] + remainder
                case "multiplied":
                    formula = [int(x_value) * int(y_value)] + remainder
                case "divided":
                    formula = [int(x_value) / int(y_value)] + remainder
                case _:
                    raise ValueError("syntax error")  # <-- "fall through case for no match."
        except:
            raise ValueError("syntax error")  # <-- error handling for anything else that goes wrong.

    return int(formula[0])
```
Import Callables from the Operator Module
```python
from operator import add, mul, sub
from operator import floordiv as div


OPERATIONS = {"plus": add, "minus": sub, "multiplied": mul, "divided": div}


def answer(question):
    if not question.startswith("What is") or "cubed" in question:
        raise ValueError("unknown operation")

    question = question.removeprefix("What is").removesuffix("?").strip()

    if not question:
        raise ValueError("syntax error")

    if (question.startswith("-") and question[1:].isdigit()) or question.isdigit():
        return int(question)

    equation = [word for word in question.split() if word != 'by']

    while len(equation) > 1:
        try:
            x_value, operation, y_value, *rest = equation
            equation = [OPERATIONS[operation](int(x_value), int(y_value)),
                        *rest]
        except:
            raise ValueError("syntax error")

    return equation[0]
```
This approach is nearly identical to the string, list, and dict methods approach, so it is recommended to review that before going over this one.
The two major differences are the operator module, and the elimination of the if-elif-else block.
The solution begins by importing basic mathematical operations as functions from the operator module.
These functions (floordiv is aliased to “div”) are stored in a dictionary that serves as a lookup table when the problems are processed.
Each function is later looked up by its operation word and called by placing () after the lookup and supplying arguments.
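A brief sketch of the lookup-then-call pattern, reusing the imports and dictionary from the solution above. The operand values are arbitrary.

```python
from operator import add, mul, sub
from operator import floordiv as div

OPERATIONS = {"plus": add, "minus": sub, "multiplied": mul, "divided": div}

print(OPERATIONS["plus"](2, 3))       # 5
print(OPERATIONS["multiplied"](4, 6)) # 24
print(OPERATIONS["divided"](25, 5))   # 5
print(OPERATIONS["divided"](-7, 2))   # -4, floor division rounds toward negative infinity
```

Because floordiv rounds toward negative infinity, results can differ from the int(x / y) conversion used in the first approach when an operand is negative: int(-7 / 2) is -3.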
In answer(), the question is first checked for validity, cleaned, and finally split into a list using str.startswith, str.removeprefix/str.removesuffix, strip, and split.
An empty string raises a ValueError("syntax error"), a question consisting of a single (optionally negative) integer is returned immediately, and the word “by” is filtered from the equation list using a list-comprehension.
The equation list is then processed in a while-loop within a try-except block.
The list is unpacked (see also concept: unpacking and multiple assignment) into x_value, operation, y_value, and *rest, and reduced by looking up and calling the mathematical function in the OPERATIONS dictionary and passing in int(x_value) and int(y_value) as arguments.
The processing of the equation list continues until it is of len() 1, at which point the single element is returned as the answer.
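To show what the try-except is trapping, the sketch below performs a single pass of the loop body. `reduce_once` is a hypothetical helper, and naming the exception types explicitly is a choice made for this illustration; the solution itself uses a bare except.

```python
from operator import add, mul, sub
from operator import floordiv as div

OPERATIONS = {"plus": add, "minus": sub, "multiplied": mul, "divided": div}


def reduce_once(equation):
    """Hypothetical helper: one pass of the while-loop body."""
    try:
        x_value, operation, y_value, *rest = equation
        return [OPERATIONS[operation](int(x_value), int(y_value)), *rest]
    except (KeyError, ValueError, ZeroDivisionError):
        # KeyError: unknown operation word.
        # ValueError: too few elements to unpack, or int() given a non-number.
        # ZeroDivisionError: "divided" with a zero divisor.
        raise ValueError("syntax error")


print(reduce_once(["1", "plus", "5", "minus", "2"]))  # [6, 'minus', '2']
print(reduce_once([6, "minus", "2"]))                 # [4]
```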
To walk through this step-by-step, you can interact with this code on pythontutor.com.
Using a list-comprehension to filter out “by” can be replaced with the str.replace method during question cleaning.
Implicit line joining inside parentheses can be used to improve the readability of the chained method calls:
```python
question = (question.removeprefix("What is")
                    .removesuffix("?")
                    .replace("by", "")
                    .strip())  # <-- Enclosing () means these lines are automatically joined by the interpreter.
```
The call to str.replace could instead be chained to the call to split when creating the equation list:
```python
equation = question.replace("by", "").split()
```
Regex with the Operator Module
```python
import re
from operator import add, mul, sub
from operator import floordiv as div


OPERATIONS = {"plus": add, "minus": sub, "multiplied by": mul, "divided by": div}

REGEX = {
    'number': re.compile(r'-?\d+'),
    'operator': re.compile(f'(?:{"|".join(OPERATIONS)})\\b')
}


# Helper function to extract a number from the question.
def get_number(question):
    # Match a number.
    pattern = REGEX['number'].match(question)

    # Toss an error if there is no match.
    if not pattern:
        raise ValueError("syntax error")

    # Remove the matched pattern from the question, and convert
    # that same pattern to an int. Return the modified question and the int.
    return [question.removeprefix(pattern.group(0)).lstrip(),
            int(pattern.group(0))]


# Helper function to extract an operation from the question.
def get_operation(question):
    # Match an operation word
    pattern = REGEX['operator'].match(question)

    # Toss an error if there is no match.
    if not pattern:
        raise ValueError("unknown operation")

    # Remove the matched pattern from the question, and look up
    # that same pattern in OPERATIONS. Return the modified question and the operator.
    return [question.removeprefix(pattern.group(0)).lstrip(),
            OPERATIONS[pattern.group(0)]]


def answer(question):
    prefix = "What is"

    # Toss an error right away if the question isn't valid.
    if not question.startswith(prefix):
        raise ValueError("unknown operation")

    # Clean the question by removing the suffix and prefix and whitespace.
    question = question.removesuffix("?").removeprefix(prefix).lstrip()

    # the question should start with a number
    question, result = get_number(question)

    # While there are portions of the question left, continue to process.
    while len(question) > 0:
        # can't have a number followed by a number
        if REGEX['number'].match(question):
            raise ValueError("syntax error")

        # Call get_operation and unpack the result
        # into question and operation.
        question, operation = get_operation(question)

        # Call get_number and unpack the result
        # into question and num
        question, num = get_number(question)

        # Perform the calculation, using result and num as
        # arguments to operation.
        result = operation(result, num)

    return result
```
This approach uses two dictionaries: one of operations imported from the operator module, and another that holds compiled regex patterns for matching numbers and operations in the text of a question.
It defines two “helper” functions, get_number() and get_operation(), that take a question and use the regex patterns to remove, convert, and return a number (get_number()) or an operation (get_operation()), along with a modified “new question”.
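The sketch below shows how the two patterns behave on sample input. `OPERATION_WORDS`, `NUMBER`, and `OPERATOR` are names chosen for this illustration; the list holds the same words as the keys of the OPERATIONS dictionary, and the sample strings are made up.

```python
import re

# Same words as the keys of OPERATIONS; a plain list is enough for this sketch.
OPERATION_WORDS = ["plus", "minus", "multiplied by", "divided by"]

NUMBER = re.compile(r'-?\d+')
OPERATOR = re.compile(f'(?:{"|".join(OPERATION_WORDS)})\\b')

print(OPERATOR.pattern)  # (?:plus|minus|multiplied by|divided by)\b

found = NUMBER.match("-12 plus 3")
print(found.group(0))                                       # -12
print("-12 plus 3".removeprefix(found.group(0)).lstrip())   # plus 3

print(OPERATOR.match("multiplied by 4").group(0))  # multiplied by
print(OPERATOR.match("multiplied 4"))              # None, "by" is part of the operation here
print(NUMBER.match("plus 3"))                      # None, match() only looks at the start
```

Since `match()` anchors at the start of the string, each helper can only consume whatever comes next in the question, which is what lets the loop enforce the number, operation, number ordering.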
In the answer() function, the question is checked for validity (does it start with “What is”), and a ValueError("unknown operation") is raised if it is not a valid question.
Next, the question is cleaned with str.removeprefix & str.removesuffix, removing “What is” and “?”.
Leading whitespace is stripped with the help of lstrip().
After that, the variable result is declared with an initial value from get_number().
The question is then iterated over via a while-loop, which calls get_operation() and get_number() — “reducing” the question by removing the leading numbers and operator.
The return values from each call are unpacked into a “leftover” question portion, and the number or operator.
The returned operation is then called using (), with result and the “new” number (returned from get_number()) passed as arguments.
The loop then proceeds with processing of the “new question”, until the len() is 0.
Once there is no more question to process, result is returned as the answer.
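As a simplified trace of this loop, the sketch below records the leftover question and the running result after each pass. `take` and `trace` are hypothetical helpers that stand in for get_number() and get_operation(); unlike the solution, they raise "syntax error" for every failed match and skip the separate number-followed-by-number check.

```python
import re
from operator import add, mul, sub
from operator import floordiv as div

OPERATIONS = {"plus": add, "minus": sub, "multiplied by": mul, "divided by": div}
NUMBER = re.compile(r'-?\d+')
OPERATOR = re.compile(f'(?:{"|".join(OPERATIONS)})\\b')


def take(pattern, question):
    """Hypothetical helper: split off a leading match, return (rest, matched text)."""
    found = pattern.match(question)
    if not found:
        raise ValueError("syntax error")
    return question.removeprefix(found.group(0)).lstrip(), found.group(0)


def trace(question):
    """Hypothetical helper: list (leftover question, running result) after each pass."""
    question, first = take(NUMBER, question)
    result = int(first)
    states = [(question, result)]
    while len(question) > 0:
        question, word = take(OPERATOR, question)
        question, num = take(NUMBER, question)
        result = OPERATIONS[word](result, int(num))
        states.append((question, result))
    return states


print(trace("3 plus 2 multiplied by 3"))
# [('plus 2 multiplied by 3', 3), ('multiplied by 3', 5), ('', 15)]
```

The trace also shows that operations are applied strictly left to right, so the example evaluates to 15 rather than 9.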
Source: Exercism python/wordy