Removing punctuation from start and end of string

Nov 16, 2018 at 4:22pm
I would like to remove the punctuation at the start of a string and at the end.
My code so far:

string remPunc(string &word)
{
    for(int i = 0; i < word.length(); i++){
        if(ispunct(word[i])){
            word.erase(i, 1);
        }
    }
    return word;
}


int main(){
    string word("&&&Hello&&&");
    remPunc(word);
    cout << word << endl;
}


This currently outputs:
 
&Hello&


I don't understand why the loop doesn't erase the punctuation just before and after the string. Also if I change the string to just "&Hello&", the punctuation is removed. Would appreciate some help as to what's happening here, thanks!

Last edited on Nov 16, 2018 at 4:23pm
Nov 16, 2018 at 5:02pm
You erase an element, but your index i doesn't take this into account. You need to update your index as well after erasing an element, since everything gets shifted to the left.
Nov 16, 2018 at 5:03pm
On the first iteration you remove the first character, leaving the string "&&Hello&&&". On the next iteration you look at index 1 (which is now the second '&'). It gets removed, and you have "&Hello&&&". Now you are looking at index 2, which is the 'e', so the '&' left at index 0 and the 'H' at index 1 are never examined. The same skipping happens at the end of the string.

Nov 16, 2018 at 5:52pm
If you run the for loop in reverse it will work.
Nov 16, 2018 at 6:28pm
closed account (z05DSL3A)
If you want to only remove punctuation from start and end of string look at std::string::find_first_not_of()[1] and std::string::find_last_not_of()[2] to use with std::string::erase().

__________________________________________________________
[1] http://www.cplusplus.com/reference/string/string/find_first_not_of/
[2] http://www.cplusplus.com/reference/string/string/find_last_not_of/
Nov 16, 2018 at 7:29pm
I recommend being careful erasing/inserting elements while iterating through a container. I would just avoid it when possible.

I would just do something like this, which is very readable and behaves in a very easily understandable way.

string remPunc(const string& word)
{
    string result;
    for (auto ch: word)
        if (!ispunct(ch))
            result += ch;
    return result;
}


Nov 16, 2018 at 8:39pm
I agree, that's much less error-prone.
Nov 17, 2018 at 8:21am
For a real app I would recommend calling reserve() up front to avoid repeated memory allocations as the result grows.
string remPunc(const string& word)
{
    string result;
    result.reserve(word.size());

    // rest of above code
}
Nov 17, 2018 at 10:20am
#include <iostream>
#include <string>
#include <cctype>
using namespace std;

string trimPunct( const string &str )
{
   int p = 0, q = str.size() - 1;
   while ( p <= q && !isalnum( str[p] ) ) p++;
   while ( p <= q && !isalnum( str[q] ) ) q--;
   if ( p > q ) return "";
   else         return str.substr( p, q - p + 1 );
}

int main()
{
   string tests[] = { "", "   ", "  &&&Hello", "&Hello  !", "R2D2 isn't here!!", "none" };
   for ( string s : tests ) cout << "[" << s << "] -> [" << trimPunct( s ) << "]\n";
}


[] -> []
[   ] -> []
[  &&&Hello] -> [Hello]
[&Hello  !] -> [Hello]
[R2D2 isn't here!!] -> [R2D2 isn't here]
[none] -> [none]
Last edited on Nov 17, 2018 at 10:26am
Nov 17, 2018 at 10:46am
closed account (z05DSL3A)
This defaults to whitespace trimming* but you can give it a string of what you want...
#include <iostream>
#include <string>

std::string& ltrim(std::string& str, const std::string& chars = "\t\n\v\f\r ")
{
    str.erase(0, str.find_first_not_of(chars));
    return str;
}

std::string& rtrim(std::string& str, const std::string& chars = "\t\n\v\f\r ")
{
    str.erase(str.find_last_not_of(chars) + 1);
    return str;
}

std::string& trim(std::string& str, const std::string& chars = "\t\n\v\f\r ")
{
    return ltrim(rtrim(str, chars), chars);
}

int main()
{
    std::string strings[] = { " ", "test string", " test string", "test string ",
                              " test string ", "  test string  " };

    for (size_t i = 0; i < 6; i++)
    {
        std::cout << "|" << strings[i] << "|\n";
        std::cout << "|" << trim(strings[i]) << "|\n\n";
    }

}

| |
||

|test string|
|test string|

| test string|
|test string|

|test string |
|test string|

| test string |
|test string|

|  test string  |
|test string|



______________________________________
* I find I need to trim whitespace more often than I trim punctuation. ;0)