Odtnhj

Question

I want to add multiple-word support: each character of a word should be replaced by the number of uppercase characters in that word.

E.g. abc AAAA cbAAaa => 000 4444 222222

For now the program converts only the first word.

Stepping through it in the MARS simulator's debugger, all the loops seem to run correctly, and the register values look right too. (Maybe I'm missing something.)
I assume that words are shorter than 10 characters.

If anyone can spot the mistake, I would be grateful.
Any tips on debugging this, or on improving the code, are also welcome.

My code:

 .data
prompt: .asciiz "Enter a string: "
msgout: .asciiz "Output string: "
input: .space 256
output: .space 256
 .text
 .globl main

main:
 li $v0, 4 # Print enter a string prompt
 la $a0, prompt 
 syscall

 li $v0, 8 # Ask the user for the string they want to reverse
 la $a0, input # We'll store it in 'input'
 li $a1, 256 # Only 256 chars/bytes allowed
 syscall

 la $t2, ($a0) # t2 - input string

 word:
 li $t1, 0 # Normal counter
 li $t5, 0 # Uppercase counter
 li $t6, 0 # First letter of word
 j word_countUppercase
 word_precountUppercase:
 addi $t1, $t1, 1 # Add 1 to index to avoid space in next word
 la $t6, ($t1) # Set t6 to the first index of t2 (start of word)
 la $t5, 0 # $t5 - 0
 word_countUppercase:
 #addi $t1, $t1, $t7
 add $t3, $t2, $t1 # $t2 is the base address for our 'input' array, add loop index
 lb $t4, 0($t3) # load a byte at a time according to counter

 beq $t4, ' ', word_prereplace # We found end of word
 bltu $t4, ' ', word_prereplace # We found end of string 

 addi $t1, $t1, 1 # Advance our counter (i++)

 bltu $t4, 'A', word_countUppercase
 bgtu $t4, 'Z', word_countUppercase

 addi $t5, $t5, 1 # Advance our counter (i++)
 j word_countUppercase

 word_prereplace:
 la $t2, ($a0) # t2 - input string
 la $t1, ($t6) # Normal counter
 addi $t5, $t5, '0'

 word_replace:
 add $t3, $t2, $t1 # $t2 is the base address for our 'input' array, add loop index
 lb $t4, 0($t3) # load a byte at a time according to counter 

 beq $t4, ' ', word_replaceExit # end of the word
 bltu $t4, ' ', exit # We found end of string 

 sb $t5, output($t1) # Overwrite this byte address in memory 

 addi $t1, $t1, 1 # Advance our counter (i++)
 j word_replace
 word_replaceExit:
 j word_precountUppercase



exit:
 li $v0, 4 # Print msgout
 la $a0, msgout
 syscall

 li $v0, 4 # Print the output string!
 la $a0, output
 syscall

 li $v0, 10 # exit()
 syscall

score 2 · Accepted Answer · 2018-11-11 10:15:03Z

EDIT: the answer to the original question is that the original code filled the output buffer only with the bytes corresponding to the words' characters and left the bytes between the words unwritten. That memory happens to be zeroed in the MARS simulator, so there was accidentally a zero terminator right after the first word. The "print string" service of MARS expects a zero-terminated string, so only the first word was printed.
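To see the effect in isolation, here is a minimal sketch for MARS. The labels buf and nl and the literal bytes are made up for this demo and are not taken from the question's code; it relies on MARS zero-filling .space, as described above.

```mips
# Demo only: shows "print string" stopping at an unwritten (zero) byte.
        .data
buf:    .space 16              # MARS zero-fills .space, so every byte starts as 0
nl:     .asciiz "\n"
        .text
        .globl main
main:
        la   $t0, buf
        li   $t1, '1'
        sb   $t1, 0($t0)       # "111" at offsets 0..2
        sb   $t1, 1($t0)
        sb   $t1, 2($t0)
        li   $t1, '2'
        sb   $t1, 4($t0)       # "222" at offsets 4..6, offset 3 is never written
        sb   $t1, 5($t0)
        sb   $t1, 6($t0)

        li   $v0, 4            # print string: stops at the 0 left at offset 3
        move $a0, $t0
        syscall                # prints "111"

        li   $v0, 4
        la   $a0, nl
        syscall

        li   $t1, ' '
        sb   $t1, 3($t0)       # fill the gap with the separator
        sb   $zero, 7($t0)     # explicit terminator, do not rely on zeroed memory

        li   $v0, 4
        move $a0, $t0
        syscall                # prints "111 222"

        li   $v0, 10           # exit
        syscall
```

The second print shows the whole buffer only because the separator byte was written; the explicit terminator at offset 7 is there so the result does not depend on what the simulator happened to leave in memory.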

Here is my variant of the same task, using various shortcuts to do the same thing in (marginally) fewer instructions (it is still O(N) complexity).

I also wrote it so that inputs with multiple spaces, inputs starting or ending with a space, and empty input all work correctly (for two spaces on input it also outputs two spaces). I did not test all of these with your original code, so I am not saying it has a bug there; it looks like it should handle most of them well. I only tested my own variant thoroughly:

# delayed branching should be OFF
.data
prompt: .asciiz "Enter a string: "
msgout: .asciiz "Output string: "
input: .space 256
output: .space 256
 .text
 .globl main

main:
 li $v0, 4 # Print enter a string prompt
 la $a0, prompt
 syscall

 li $v0, 8 # Read the input string from the user
 la $a0, input # We'll store it in 'input'
 li $a1, 256 # Only 256 chars/bytes allowed
 syscall

 la $a1, output
 # a0 = input, a1 = output
new_word:
 move $t0, $zero # t0 word length = 0
 li $t1, '0' # t1 uppercase counter = '0' (ASCII counter)
word_parse_loop:
 lbu $t2, ($a0) # next input character
 addi $a0, $a0, 1 # advance input pointer
 bltu $t2, 33, word_done # end of word detected (space or newline)
 # "less than 33" to get shorter code than for "less/equal than 32"
 addi $t0, $t0, 1 # ++word length
 # check if word character is uppercase letter
 addiu $t2, $t2, -65 # subtract 'A' => makes t2 = 0..25 for 'A'..'Z'
 sltiu $t3, $t2, 26 # t3 = (t2 < 26) ? 1 : 0
 add $t1, $t1, $t3 # ++uppercase counter if uppercase detected
 j word_parse_loop

word_output_fill:
 # loop to fill output with uppercase-counter (entry is "word_done" below)
 sb $t1, ($a1)
 addi $a1, $a1, 1
 addiu $t0, $t0, -1
word_done:
 # t0 = word length, t1 = uppercase ASCII counter, t2 = space, newline or less
 # a0 = next word (or beyond data), a1 = output pointer (to be written to)
 bnez $t0, word_output_fill

 bltu $t2, ' ', it_was_last_word
 # t2 == space, move onto next word in input (store space also in output)
 sb $t2, ($a1)
 addi $a1, $a1, 1
 j new_word

it_was_last_word:
 # finish output data by storing zero terminator
 sb $zero, ($a1)

 # output result
 li $v0, 4 # Print msgout
 la $a0, msgout
 syscall

 li $v0, 4 # Print the output string!
 la $a0, output
 syscall

 li $v0, 10 # exit()
 syscall

Things to note ("tricks"?):

The uppercase counter starts at the value 48 (the character code of '0'), so the "counter" holds an ASCII digit the whole time and is ready to be written into the string without any conversion. This works for counts below 10; for 10 and above it runs on into the characters that follow the digits. Because the counter is never used as an integer anywhere, the conversion can be optimized out like this.
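As a small illustration of that point, the sketch below (a standalone MARS program written for this note, not part of the answer's code) starts a counter at '0', bumps it three times, and prints it directly as a character.

```mips
# Demo only: a counter that is an ASCII digit from the start.
        .text
        .globl main
main:
        li   $t1, '0'          # counter starts at 48, the code of the digit 0
        addi $t1, $t1, 1       # three increments, as if three uppercase
        addi $t1, $t1, 1       # letters had been seen
        addi $t1, $t1, 1

        li   $v0, 11           # MARS "print character" service
        move $a0, $t1
        syscall                # prints 3, no integer-to-ASCII step needed

        li   $v0, 10           # exit
        syscall
```

With ten increments instead of three the register would hold 58, which prints as ':' rather than as a two-digit number, which is the limitation mentioned above.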

It advances through the input and the output sequentially, never reading any input twice or readjusting the input/output position, so an algorithm like this can also work with "stream"-like data. It produces almost 1:1 output for every input character, except that the output is delayed "per word": it consumes the input stream until the end of a word, and then the whole output word is produced. This architecture may matter with some I/O, such as magnetic tape on input and a serial printer on output.

The check for the A..Z uppercase range uses only a single compare. The letter 'A' is subtracted from the character first, which normalizes uppercase letters to 0..25. Every other character, when the result is treated as an unsigned integer, ends up greater than 25 (characters below 'A' wrap around to very large values), so a single "< 26" test is enough to detect all uppercase letters.
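The single-compare test can be tried on its own with a sketch like the following. The label sample and its contents are made up for this demo; the characters were picked to sit on and just outside both ends of the 'A'..'Z' range.

```mips
# Demo only: prints 1 for uppercase letters and 0 for everything else.
        .data
sample: .asciiz "A@Z[a"        # '@' is 'A'-1 and '[' is 'Z'+1
        .text
        .globl main
main:
        la   $t0, sample
loop:
        lbu  $t2, 0($t0)       # next character, zero-extended
        beqz $t2, done         # stop at the terminator
        addiu $t2, $t2, -65    # 'A'..'Z' become 0..25; '@' (64) wraps to 0xFFFFFFFF
        sltiu $t3, $t2, 26     # unsigned compare: 1 only for 0..25
        li   $v0, 1            # MARS "print integer" service
        move $a0, $t3
        syscall
        addi $t0, $t0, 1       # advance to the next character
        j    loop
done:
        li   $v0, 10           # exit
        syscall
```

For this input the program prints 10100: only 'A' and 'Z' pass, while '@' wraps to a huge unsigned value and '[' and 'a' land at 26 and 32.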

The uppercase counter is updated every time, by adding either 0 or 1 (depending on the condition described above), which avoids extra branching in the code. Modern CPUs generally favour branch-free code, because they can fetch and speculate ahead more aggressively when there is no branch to mispredict. So when a branch is taken roughly 50% of the time, the branch-free variant usually performs better. When a branch is taken only 5-10% of the time, branching away on that uncommon condition and falling through for the common case may be better (e.g. the "end of word" branches).

If you have any other questions about a particular part of the code, feel free to ask.
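For comparison, the sketch below counts the uppercase letters of one word both ways in the same loop: once with the two conditional branches used in the question's code, and once with the branch-free add. The label sample and the register assignment are choices made for this demo only.

```mips
# Demo only: branching and branch-free uppercase counts side by side.
        .data
sample: .asciiz "cbAAaa"       # word from the question's example, 2 uppercase letters
        .text
        .globl main
main:
        la   $t0, sample
        li   $t5, 0            # count kept by the branching version
        li   $t6, 0            # count kept by the branch-free version
loop:
        lbu  $t2, 0($t0)       # next character
        beqz $t2, done         # stop at the terminator
        addi $t0, $t0, 1

        # branching version: two conditional jumps around the increment
        bltu $t2, 'A', skip
        bgtu $t2, 'Z', skip
        addi $t5, $t5, 1
skip:
        # branch-free version: always add, the addend is 0 or 1
        addiu $t3, $t2, -65
        sltiu $t3, $t3, 26
        add  $t6, $t6, $t3
        j    loop
done:
        li   $v0, 1            # MARS "print integer" service
        move $a0, $t5
        syscall                # prints 2
        li   $v0, 1
        move $a0, $t6
        syscall                # prints 2
        li   $v0, 10           # exit
        syscall
```

Both counters end at 2, so the program prints 22. In a simulator like MARS there is no measurable speed difference between the two; the sketch only shows that the results agree and what the instruction sequences look like.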

sswwqqaa 5531314 · Answer 2 · 2018-11-10 23:51:21Z


I had empty gaps in my output, so the string was 111, 222, 1111 with unwritten bytes in between, and print stopped at the first gap, so only 111 was printed.

To fix it I added this code (before the word label):

li $t1, 0 # Normal counter
 rewriteoutput:
 add $t3, $t2, $t1 # $t2 is the base address for our 'input' array, add loop index
 lb $t4, 0($t3) # load a byte at a time according to counter

 bltu $t4, ' ', word # We found end of string

 sb $t4, output($t1) # Overwrite this byte address in memory
 addi $t1, $t1, 1 # Advance our counter (i++)
 j rewriteoutput

I know this can probably be done in a better way, but I cannot understand why I can't do

sw $a0, output

instead (error at runtime: Runtime exception at 0x0040002c: store address not aligned on word boundary 0x10010121).


2

It was not "empty space", it was zero, and the "print string" syscall expects a zero-terminated string, so it printed only the first three characters. I don't understand what sw $a0, output should do and where, but the output address is evidently not word-aligned, so you can't store a word there. What do you want to achieve with it?
– Ped7g
Nov 11 at 0:01

So is there any shorter way to do what I just wrote? Or do I have to do it manually, as I did in the output rewriting?
– sswwqqaa
Nov 11 at 0:51

1

Shorter way of what exactly? I don't see how sw $a0, output relates to your other stuff. It would store the current word value (32 bits) in a0 to memory starting at the address output, which is a char buffer, so storing a 32-bit word there is like writing four characters at once (but it fails because word writes require a word-aligned memory address). Your fix, writing the space between words into the output, is the correct one, but your overall code is quite convoluted and complex and can be done in considerably fewer instructions. That takes months or years of experience, though. I'll try to post my version.
– Ped7g
Nov 11 at 1:02

1

I mean, your solution is very reasonable for somebody who is just learning, and for a first working version. It also reads quite well and seems to follow an idea: it is not just a random mess of instructions added until "it works", it tries to stay "to the point". My point is not to stop there (at the first working version) but to keep experimenting a bit more and see whether you can find different ways of writing the same task. Once you are solid in the basics, asm can be quite fun, figuring out different concepts and tricks for how the same thing can be done (at least I love it).
– Ped7g
Nov 11 at 1:15

Replace each character of the word by the number of uppercase characters in this word
